Html.Table

The M Code Behind the Power Query M function Html.Table

To better understand how the Html.Table function works, it helps to look at the M code that is written around it. In this article, we will explore that M code and provide examples of how Html.Table can be used to extract data from HTML tables.

What is the Html.Table function?

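The Html.Table function is a built-in Power Query function that extracts data from HTML content and returns it as a table. It takes three parameters: the HTML content (html), a list of column name and CSS selector pairs (columnNameSelectorPairs), and an optional options record, in which RowSelector identifies the elements that define the rows. Its signature is Html.Table(html as any, columnNameSelectorPairs as list, optional options as nullable record) as table.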
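As a minimal sketch of the call shape, the following passes a literal HTML fragment instead of a downloaded page. The fragment and the column names are illustrative and follow the example in Microsoft's function reference.

```powerquery
let
    // Illustrative HTML fragment; a real query would download this
    html = "<div class=""name"">Jo</div><span>Manager</span>",
    result = Html.Table(
        html,
        // Column name / CSS selector pairs
        {{"Name", ".name"}, {"Title", "span"}},
        // Elements matching .name define the rows
        [RowSelector = ".name"]
    )
in
    result
```

According to the function reference, this returns a single row with Name = Jo and Title = Manager.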

How does the Html.Table function work?

The Html.Table function works by parsing the HTML content and running the supplied CSS selectors against it. Each selector fills one column of the result, and the RowSelector option, when given, determines which elements mark the rows. Because it is driven by selectors, it is not limited to table elements, although HTML tables are the most common target.

Html.Table itself is implemented inside the Power Query engine, and its internals are not published as M source. The M code discussed in this article is therefore the code of a query built around the function: the calls that retrieve the HTML, the selectors passed to Html.Table, and the steps that shape the result.

Exploring the M code behind the Html.Table function

A query built around the Html.Table function can be broken down into several distinct steps:

Step 1: Retrieving the HTML content of the webpage

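The first step is to retrieve the HTML content of the webpage. This is typically done with Web.BrowserContents, which returns the HTML of a URL as text, or with Web.Contents, which returns the raw response as binary. Web.Page is a different function: it takes downloaded content rather than a URL and returns a table describing the structures found in the page, not text.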
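A short sketch of two ways to obtain HTML text that can be handed to Html.Table. The URL is the placeholder address used in the examples later in this article, and the step names are illustrative.

```powerquery
let
    url = "https://www.example.com/table.html",
    // Option A: the HTML as seen by a browser, returned as text
    htmlFromBrowser = Web.BrowserContents(url),
    // Option B: the raw response as binary, decoded to text (assumes UTF-8)
    htmlFromRaw = Text.FromBinary(Web.Contents(url), TextEncoding.Utf8)
in
    htmlFromBrowser
```

Because M evaluates lazily, only the step named after in is actually executed; swap in htmlFromRaw to use the second option. Web.BrowserContents may not be available in every Power Query host.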

Step 2: Parsing the HTML content

Once the HTML content has been retrieved, it must be parsed and the table rows within it identified. Html.Table does the parsing itself, so no separate call is needed; Xml.Tables is intended for XML documents and is not part of this workflow. The rows are identified by the RowSelector option, for example the selector TABLE > * > TR, which matches every row of every table in the page.
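As a sketch of how table rows can be identified with the RowSelector option of Html.Table, the following uses a small literal table. The HTML, the column names, and the rowSelector step are illustrative; the selector pattern resembles the one that the Power BI Desktop web connector generates.

```powerquery
let
    // Illustrative two-row table
    html = "<table><tr><td>Jo</td><td>Manager</td></tr><tr><td>Sam</td><td>Analyst</td></tr></table>",
    // "> *" allows for a tbody element between table and tr
    rowSelector = "TABLE > * > TR",
    result = Html.Table(
        html,
        {
            {"Name", rowSelector & " > :nth-child(1)"},
            {"Title", rowSelector & " > :nth-child(2)"}
        },
        [RowSelector = rowSelector]
    )
in
    result
```

The intent is one output row per tr element, with the first and second cells of each row landing in Name and Title.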

Step 3: Extracting table data

Once the rows have been identified, the data is extracted by the column name and selector pairs. Each pair names an output column and gives the CSS selector for its cells. By default the text content of the matched element is returned; an optional third item in the pair is a function that can return something else, such as an attribute value.
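To illustrate extracting something other than the visible text, the sketch below adds a function as a third item in the selector pair so that the href attribute of a link is returned. The HTML fragment is illustrative and follows the example in Microsoft's function reference.

```powerquery
let
    // Illustrative fragment containing a single link
    html = "<a href=""/test.html"">Test</a>",
    result = Html.Table(
        html,
        {
            // The third item receives the matched element as a record;
            // without it, the element's text content would be returned
            {"Link", "a", each [Attributes][href]}
        }
    )
in
    result
```

According to the function reference, this returns a single row whose Link value is /test.html.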

Step 4: Combining table data

When data has been extracted from several tables with separate Html.Table calls, the results must be combined into a single table structure. This is typically done using the Table.Combine function, which appends a list of tables into a single table, matching columns by name.
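A minimal sketch of the combining step, using two literal tables in place of extracted ones. The table contents are made up for illustration.

```powerquery
let
    // Stand-ins for two tables extracted from a page
    first = #table({"Column1", "Column2"}, {{"a", "1"}}),
    second = #table({"Column1", "Column2"}, {{"b", "2"}}),
    // Append the rows of both tables; columns are matched by name
    combined = Table.Combine({first, second})
in
    combined
```

The result has the two columns Column1 and Column2 and two rows, one from each input table.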

Step 5: Cleaning and transforming the data

Once the data has been combined into a single table structure, it must be cleaned and transformed to make it usable in data analysis. Html.Table returns the text content of each matched element by default, so this step typically involves functions such as Table.PromoteHeaders to turn a header row into column names, Table.TransformColumns to trim or otherwise tidy values, and Table.TransformColumnTypes to convert text to numbers or dates.
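A sketch of a typical cleaning sequence, assuming the extracted table arrives with its header in the first row and every value as text. The sample data and the column names Product and Price are hypothetical.

```powerquery
let
    // Stand-in for the output of the extraction step
    raw = #table(
        {"Column1", "Column2"},
        {{"Product", "Price"}, {" Widget ", "9.50"}, {"Gadget", "12.00"}}
    ),
    // Use the first row as column names
    promoted = Table.PromoteHeaders(raw, [PromoteAllScalars = true]),
    // Remove stray whitespace around the product names
    trimmed = Table.TransformColumns(promoted, {{"Product", Text.Trim, type text}}),
    // Convert the price text to numbers, reading "." as the decimal separator
    typed = Table.TransformColumnTypes(trimmed, {{"Price", type number}}, "en-US")
in
    typed
```

Passing an explicit culture to Table.TransformColumnTypes keeps the conversion independent of the regional settings of the machine running the query.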

Using the Html.Table function in Power Query

Now that we have explored the M code behind the Html.Table function, let’s take a look at some examples of how it can be used in Power Query.

Example 1: Extracting data from a single HTML table

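The following M code demonstrates how to extract data from a single HTML table using the Html.Table function:

    let
        // Html.Table needs the HTML itself, so download the page as text first
        html = Web.BrowserContents("https://www.example.com/table.html"),
        // One output row per table row; one selector per output column
        data = Html.Table(html, {
            {"Column1", "TABLE > * > TR > :nth-child(1)"},
            {"Column2", "TABLE > * > TR > :nth-child(2)"},
            {"Column3", "TABLE > * > TR > :nth-child(3)"}
        }, [RowSelector = "TABLE > * > TR"])
    in
        data

This code retrieves the HTML content of the webpage located at https://www.example.com/table.html, treats each table row as a row of the result, and returns a table with columns named Column1, Column2, and Column3, filled from the first, second, and third cells of each row. If the HTML table has a header row, it appears as the first row of the result.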

Example 2: Extracting data from multiple HTML tables

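The following M code demonstrates how to extract data from multiple HTML tables using the Html.Table function:

    let
        html = Web.BrowserContents("https://www.example.com/tables.html"),
        // Placeholder selectors, one per table; adjust them to the real page
        tableSelectors = {"TABLE#first", "TABLE#second"},
        // Run Html.Table against the rows of one table
        extract = (t as text) as table => Html.Table(html,
            List.Transform({1 .. 3}, (i) => {"Column" & Text.From(i), t & " > * > TR > :nth-child(" & Text.From(i) & ")"}),
            [RowSelector = t & " > * > TR"]),
        data = Table.Combine(List.Transform(tableSelectors, extract))
    in
        data

This code retrieves the HTML content of the webpage located at https://www.example.com/tables.html, runs Html.Table once for each table selector, and combines the results into a single table structure with Table.Combine. The selectors TABLE#first and TABLE#second are placeholders that assume the tables carry those id attributes; replace them with selectors that match the tables on the real page. Any cleaning and transformation of the combined data would follow as further steps.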

The Html.Table function in Power Query is a powerful tool for extracting data from HTML tables and converting it into a usable format for data analysis. By understanding the M code behind this function, you can better understand how it works and how to use it in your own data analysis projects.

Copyright, G Com Solutions Ltd, 2024.