Learning Tableau 10(Second Edition)
上QQ阅读APP看书,第一时间看更新

Connecting to data

There is virtually no limit to the data Tableau can visualize. Each successive version of Tableau adds new native connections. Tableau continues to add native connectors for cloud-based data and recently included the Web Data Connector, which allows you to custom-build a connector for any online data you wish to retrieve. Additionally, for any database without a native connection, Tableau gives you the ability to use a generic ODBC connection. The Extract API allows you to programmatically extract and combine any data source(s) for use in Tableau.

You may have multiple data source connections to different sources in the same workbook. Each connection will show up under the Data tab on the left sidebar.

This section will focus on a few practical examples of connecting to various data sources. We won't cover every possible connection, but we will cover several that are representative of others. You may or may not have access to some of the data sources in the following examples. Don't worry if you aren't able to follow each example. Merely observe the differences.

Note

In Tableau 10, a connection technically refers to the connection made to a single set of data, such as table(s) in a single database or file(s) in a directory. A data source may contain more than one connection which can be joined together, such as a table in SQL Server joined to an Excel table. You may often hear these terms used interchangeably.

Connecting to data in a file

File-based data includes all sources of data where the data is stored in a file. File-based data sources include:

  • Tableau data extract: A .tde file containing data that was extracted from an original source. When you connect to an extract, the connection retains information about the extract, but not about the original source.
  • Microsoft access: A .mdb or .accdb database file created in Access.
  • Microsoft excel: A .xls, .xlsx, or .xlsm spreadsheet created in Excel. Multiple Excel sheets or sub tables may be joined or unioned together in a single connection.
  • Text file: A delimited text file, most commonly .txt, .csv, or .tab. Multiple text files in a single directory may be joined or unioned together in a single connection.
  • Statistical file: A .sav, .sas7bdat, .rda, or .rdata file generated by statistical tools such as SAS or R.
  • Other files: Select this option to connect to any file data source. In addition to those mentioned previously, you can connect to Tableau files to import connection you have saved in another Tableau workbook (.twb or .twbx). The connection will be imported and changes will only affect the current workbook.

Follow this example to see a connection to an Excel file:

  1. Navigate to the Connect to Excel sheet in the Chapter 02 Starter.twbx workbook.
  2. From the menu, navigate to Data | Create new data source and select Excel from the list of possible connections.
  3. In the open dialogue, open the Superstore.xlsx file from the \Learning Tableau\Chapter 02\ directory. Tableau will open the Data Source screen. You should see the two sheets of the Excel document listed on the left.
  4. Double-click the Orders sheet and then the Returns sheet. Your Data Source screen should look similar to this:

Take some time to familiarize yourself with the Data Source screen interface, which has the following features (numbered earlier):

  1. Toolbar: The toolbar has a few of the familiar controls, including undo, redo, and save.
  2. Connections: All the connections in the current data source. Click Add to add a new connection to the current data source. This allows you to join data across different connection types. Each connection will be color coded so you can distinguish which data is coming from which connection.
  3. Sheets (Tables):This lists all the tables of data available for a given connection. This includes sheets, sub tables, and named ranges for Excel, tables, views, and stored procedures for relational databases, as well as other connection-dependent options such as New Union or Custom SQL.
  4. Data source name: This is the name of the currently selected data source (in the dropdown). You may click this name to change it. You may also click the down arrow on the database icon to change which data source you wish to edit.
  5. Connection editor: Drop sheets and tables from the left in this area to make them part of the connection. For many connections, you may add multiple tables, which will be joined or unioned together. We'll take a look at some advanced examples of options later in the chapter. For now, notice that you can hover over tables in this space and get options via a dropdown.
  6. Live or Extract options: For many data sources, you may choose whether you would like to have a live connection or an extracted connection. We'll look at details later in the chapter.
  7. Data source filters: You may add filters to the data source. These will be applied universally.
  8. Preview pane options: These options allow you to specify whether you'd like to see a preview of the data or a list of metadata, and how you would like to preview the data (for example, with alias values, hidden fields shown, and how many rows you'd like to preview).
  9. Preview pane / Metadata view: Depending on your selection in the options, this space either displays a preview of data or a list of all fields with additional metadata. Notice that these views give you a wide array of options such as changing data types, hiding or renaming fields, and applying various data transformation functions. We'll consider some of these options in this and later chapters.

Once you have created and configured your data source, you can click on any sheet to start using your data source.

Conclude this exercise with the following steps:

  1. Click on the data source name to edit the text and rename the data source as Orders and Returns.
  2. Navigate to the Connect to Excel sheet and use the Orders and Returns data source to create a time series showing Profit by Quarter.

If you need to edit the connection at any time, select Data from the menu, locate your connection, and then select Edit DataSource... Alternately, you can right-click on any data source under the Data tab on the left hand bar and select Edit Data Source or click on the Data Source tab in the lower-left corner.

Tip

Prior to version 8.2, Tableau used the Microsoft JET driver to connect to Access, Excel, and Text files. Tableau now uses a new connection that avoids limitations that were present in the JET driver, but also removes the option for writing custom SQL. If you need to use the legacy connection, you can select that option using the drop-down arrow on the Open File dialog:                                                                                          

When using legacy connections, you will lose support for some aggregations, functions, and cross-database joins. Additionally, this option is not available for the Mac version.

Connecting to data on a server

Database servers, such as SQL Server, MySQL, Vertica, and Oracle, host data on one or more server machines and use powerful database engines to store, aggregate, sort, and serve data based on queries from client applications. Tableau can leverage the capabilities of these servers to retrieve data for visualization and analysis. Alternately, data can be extracted from these sources and stored in a Tableau Data Extract (.tde).

As an example of connecting to a server data source, we'll look at connecting to SQL Server. If you have access to a server-based data source, you may wish to create a new data source and explore the details. However, there is no specific example to follow in the chapter 02 workbook. As soon as the Microsoft SQL Server connection is selected, the interface displays options for some initial configuration. These are shown in the following screenshot:

A connection to SQL Server requires the Server name, as well as authentication information. A database administrator can configure SQL Server to use Windows Authentication or a SQL Server username and password. With SQL Server, you can also allow for reading uncommitted data. This can potentially improve performance, but may also lead to unpredictable results if data is being inserted, updated, or deleted at the same time Tableau is querying. Additionally, you may specify SQL to be run at connect time using the Initial SQL... link in the lower-left corner.

Note

In order to maintain high standards of security, Tableau will not save a password as a part of a data source connection. This means that if you share a workbook using a live connection with someone else, they will need to have credentials to access the data. This also means that when you first open the workbook, you will need to re-enter your password for any connections requiring a password.

Once you click the orange Sign In button, you will see a screen that is similar to the connection screen you saw for Excel. The main difference is on the left, where you have an option for selecting a database:

Once you have selected a database, you will see the following:

  • Tables: Any data tables or views contained in the selected database.
  • New Custom SQL: You may write your own custom SQL scripts and add them as tables. You may join these as you would for any other table or view.
  • Stored Procedures: You may use a stored procedure that returns a table of data. You will be given the option of setting values for stored procedure parameters or using or creating a Tableau parameter to pass values.

Once you have finished configuring the connection, click on any sheet to begin visualizing the data.

Connecting to data in the cloud

Certain data connections are made to data that is hosted in the cloud. These include Amazon Redshift, Google Analytics, Google Sheets, Salesforce, and many others. It is beyond the scope of this book to cover each connection in depth, but as an example of a cloud data source, we'll consider connecting to Google Sheets.

Google Sheets allows users to create and maintain spreadsheets of data online. Sheets may be shared and collaborated on by many different users. Here, we'll walk you through an example of connecting to a sheet that is shared via a link.

To follow the example, you'll need a free Google account. With your credentials, follow these steps:

  1. Click on the Add new data source button on the toolbar:
  2. Select Google Sheets from the list of possible data sources. You can use the search box to quickly narrow the list.
  3. On the next screen, sign in to your Google account and allow Tableau Desktop the appropriate permissions. You will then be presented with a list of all your Google Sheets, along with preview and search capabilities:
  4. Enter this URL (for convenience it is included in the Chapter 02 Starter workbook and may be copied and pasted) into the search box and click the Search button: https://docs.google.com/spreadsheets/d/171HdLYlQYSo66_VVbIe1R0UNbggXK6AHOD2gUB1Ce-0/edit?usp=sharing.
  5. Select the resulting Superstore sheet in the list and then click the Connect button. You should now see the Datasource screen.
  6. Rename the data source as Superstore (Google Sheets).
  7. Switch the connection option from Live to Extract.
  8. Navigate to the Connect to Google Sheets sheet. The data should be extracted within a few seconds.
  9. Create a filled map of Profit by State with Profit defining the Color and the Label.

Shortcuts for connecting to data

You can make certain connections quickly. These options will allow you to start doing analysis more quickly:

  • Paste data from the clipboard: If you have copied data from a spreadsheet, a table on a webpage, or a text file, you can paste the data directly into Tableau. This can be done using Ctrl + V or Data | Paste Data from the menu. The data will be stored as a file and you will be alerted of its location when you save the workbook.
  • Select File | Open from the menu: This will allow you to open any data file that Tableau supports, such as text files, Excel files, Access files (not available on Mac), and even offline cube (.cub) files.
  • Drag and drop a file from Windows Explorer onto the Tableau workspace: Any valid file-based data source can be dropped onto the Tableau workspace, or even the Tableau shortcut on your desktop or taskbar.
  • Duplicate an existing connection: You can duplicate an existing data source connection by right-clicking and selecting Duplicate.