Pdf.Tables

The M Code Behind the Power Query M function Pdf.Tables

Pdf.Tables is a function in M, the formula language used by Power Query. In this article, we will take a closer look at the M code that Power Query writes when you use Pdf.Tables and explore how it works.

Understanding the Pdf.Tables Function

Before delving into the M code behind the Pdf.Tables function, it's important to understand what the function does. Essentially, the function takes the binary contents of a PDF file as input and returns a table with one row for each table (and page) it finds in the PDF. Each row has Id, Name, Kind, and Data columns, and the Data column holds the extracted table itself.

The function has two parameters:

1. pdf: This parameter is the PDF document as a binary value, not a file path. For a local file, you typically get the binary by wrapping the path in File.Contents.

2. options: This parameter is optional and is a record that controls how tables are extracted. The documented fields are Implementation, StartPage, EndPage, MultiPageTables, and EnforceBorderLines. For example, you can use StartPage and EndPage to specify the page range that you want to extract tables from.
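As a minimal sketch of the call shape described above, the query below reads a PDF as binary and restricts extraction to a page range. The file path is the article's sample path and is only a placeholder, and the page numbers 2 and 4 are arbitrary values chosen for illustration.

```powerquery
let
    // Placeholder path from the article's example; point this at a real file.
    PdfBinary = File.Contents("C:\Users\UserName\Documents\Sample.pdf"),
    // The second argument is an options record; here it limits
    // extraction to pages 2 through 4 (illustrative values).
    Source = Pdf.Tables(PdfBinary, [StartPage = 2, EndPage = 4])
in
    Source
```

The result is still the navigation-style table, so you would normally add a further step to pick one row's Data value.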

The M Code Behind the Pdf.Tables Function

Now that we understand what the Pdf.Tables function does, let's take a look at the M code that uses it. When you connect to a PDF file in Power Query and pick a table, the editor generates M code that looks something like this:


let
    // Read the PDF as binary and list the tables found in it
    Source = Pdf.Tables(File.Contents("C:\Users\UserName\Documents\Sample.pdf")),
    // Take the first row and return the table stored in its Data column
    #"Table1" = Source{0}[Data]
in
    #"Table1"


Let's break down this code line by line to understand what's going on:

1. let: This keyword starts a let expression, which holds a list of named steps (variables). In this case, the steps are "Source" and "Table1".

2. Source: This is the name of the first step. File.Contents reads the PDF file located at "C:\Users\UserName\Documents\Sample.pdf" as binary, and Pdf.Tables extracts the tables from that binary.

3. #"Table1": This is a second step. We are assigning it the first table found in the PDF file. The #"..." form is M's quoted identifier syntax, which allows step names that contain spaces or other special characters.

4. Source{0}: This syntax is used to access the first row of the "Source" table returned by the Pdf.Tables function. Row positions are zero-based, and the row comes back as a record.

5. [Data]: This syntax is used to access the "Data" field of that row. This field contains the actual table data.

6. in: This keyword ends the list of steps and introduces the expression whose value the whole let expression returns. In this case, we are returning the table of data stored in the "Table1" step.
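Selecting by position, as in the walkthrough above, depends on the order in which tables are detected. A hedged alternative is to filter on the Kind column and then look up a row by its Id. The Id value "Table001" below is an assumption about how the rows are named in a given file, so check the Id column of your own Source step before relying on it.

```powerquery
let
    // Placeholder path from the article's example.
    Source = Pdf.Tables(File.Contents("C:\Users\UserName\Documents\Sample.pdf")),
    // Keep only the rows whose Kind is "Table", dropping whole-page entries.
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Look up the row by key instead of by position.
    // "Table001" is an assumed Id, not a guaranteed one.
    FirstTable = TablesOnly{[Id = "Table001"]}[Data]
in
    FirstTable
```

A key lookup like this raises an error if no row matches, which can be easier to diagnose than silently loading the wrong table.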

Customizing the Pdf.Tables Function

The Pdf.Tables function is a powerful tool for extracting tables from PDF files, but it's not always perfect. Depending on the structure and formatting of the PDF file, the function may not be able to extract all of the tables correctly.

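Fortunately, you can customize the M code around the Pdf.Tables call to make it work better for your specific needs. For example, you can pass an options record as the second argument to limit the page range (StartPage, EndPage), stop similar tables on consecutive pages from being combined (MultiPageTables), or treat border lines as strict cell boundaries (EnforceBorderLines).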
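The sketch below shows one way such options might be combined. The specific values are illustrative choices, not recommendations, and the file path is again the article's placeholder.

```powerquery
let
    Source = Pdf.Tables(
        // Placeholder path from the article's example.
        File.Contents("C:\Users\UserName\Documents\Sample.pdf"),
        [
            // Pin the table-detection algorithm version.
            Implementation = "1.3",
            // Do not merge similar tables that continue across pages.
            MultiPageTables = false,
            // Always treat drawn border lines as cell boundaries.
            EnforceBorderLines = true
        ]
    )
in
    Source
```

If a table comes out with merged or split cells, toggling EnforceBorderLines is usually a reasonable first thing to try.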

Additionally, you can modify the "Table1" variable to extract different tables from the PDF file. For example, if you want to extract the second table from the PDF file instead of the first table, you can modify the code like this:


let
    Source = Pdf.Tables(File.Contents("C:\Users\UserName\Documents\Sample.pdf")),
    // Index 1 is the second row, because positions are zero-based
    #"Table2" = Source{1}[Data]
in
    #"Table2"
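Indexing by position fails with an error when the PDF yields fewer rows than expected. As a small defensive sketch, assuming you would rather get null than an error in that case, you can check the row count first:

```powerquery
let
    // Placeholder path from the article's example.
    Source = Pdf.Tables(File.Contents("C:\Users\UserName\Documents\Sample.pdf")),
    // Return the second entry's data only if a second row exists.
    SecondTable =
        if Table.RowCount(Source) > 1
        then Source{1}[Data]
        else null
in
    SecondTable
```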


In conclusion, the Pdf.Tables function in Power Query is a powerful tool for extracting tables from PDF files. Because the query that calls it is ordinary M code, you can customize it to meet your specific needs. Whether you're working with financial reports, scientific research papers, or other types of PDF files, the Pdf.Tables function can help you extract the data you need quickly and easily.
