What and Why?
Before showing the steps to upgrade your TabPy to the latest pre-release version, some terminology needs to be explained (the explanations are generalized and not exact):
- TabPy release – a packaged snapshot of TabPy sources and artifacts with a tag (version) attached to it. You don’t need to use the releases directly unless you are a contributor to TabPy. Releases are created when a feature, improvement or significant bug fix is added to the codebase. All releases can be seen at https://github.com/tableau/TabPy/releases.
- Pre-release – a release for which quality verification has not been completed yet. You can see that some releases are marked as Pre-release and others as Verified.
- TabPy package – a TabPy release (not exactly, but very close to it) published on pypi.org. Any package from that site can be installed in a user environment with the pip install <package_name> command.
- TabPy pre-release package – published on test.pypi.org instead, but technically the same package that is later published on pypi.org once it is promoted to a Verified release.
Why would you want or need to install or upgrade your TabPy to a pre-release package? The only two reasons are:
- There is a feature you need, or
- There is a bug fix you need.
However, some warnings are in order:
- Pre-release packages are not fully tested and are only meant to unblock a scenario that needs a new feature or a bug fix.
- Pre-release packages may never be approved and can be recalled.
- Pre-release packages may contain breaking and incompatible changes.
- Upgrading from a pre-release package to a release may not be possible.
- There may not be any bug fixes, improvements or investigations for pre-release packages.
Installing or upgrading to a pre-release package is as simple as specifying the repository with a pip command parameter. For test.pypi.org the parameter is -i https://test.pypi.org/simple/, and the whole command to install the latest pre-release package is
pip install --upgrade -i https://test.pypi.org/simple/ tabpy
You can also specify the exact version of a package, e.g.
pip install --upgrade -i https://test.pypi.org/simple/ tabpy==0.8.13
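The two commands above can also be assembled programmatically, for example in an environment setup script. The following is only a minimal sketch: build_install_command is a hypothetical helper, not part of TabPy or pip, and it only builds the argument list rather than running it.

```python
import sys

# Index URL for pre-release packages, as given in the text above.
TEST_PYPI_INDEX = "https://test.pypi.org/simple/"


def build_install_command(version=None, prerelease=True):
    """Return the pip argument list for installing or upgrading TabPy.

    build_install_command is a hypothetical helper, not a TabPy API.
    """
    # Run pip through the current interpreter so the right environment is used.
    command = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if prerelease:
        command += ["-i", TEST_PYPI_INDEX]
    # Pin an exact version (e.g. "0.8.13") when one is requested.
    command.append(f"tabpy=={version}" if version else "tabpy")
    return command


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(build_install_command("0.8.13")))
```

To actually run the install, the returned list can be passed to subprocess.run. Calling the helper with prerelease=False drops the -i parameter and yields the regular upgrade command.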
If the pre-release version works for you, keep using it until you need to upgrade to a newer version. You can upgrade (or rather try to upgrade, see the warnings above) to a newer pre-release package as shown in the previous section. Or you can upgrade to the newer “official” package with the regular pip install --upgrade tabpy command.
If the upgrade is not possible for any reason, you’ll need to uninstall your current TabPy with the pip uninstall tabpy command and simply install it again.
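Before upgrading or reinstalling, it can help to confirm which TabPy version is currently installed. The sketch below uses the standard library importlib.metadata module (Python 3.8 or newer); the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(distribution="tabpy"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"tabpy: {version or 'not installed'}")
```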
Did you know that TabPy has some ready-to-use data science models which are installed in your Python environment as part of the TabPy package?
But first, what are TabPy models? They are simply Python functions “preserved” in TabPy and available for use in Tableau scripts. How to deploy a function into TabPy is explained on the Deploying a Function page, and how to use deployed functions in Tableau calculations is shown on the Using Deployed Functions page.
The documentation and examples mentioned above should be enough to get you started with creating, deploying and using TabPy models (or deployed functions, if you prefer that term).
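As an illustration, a model can be as small as the function below. This is a sketch under assumptions: the function add_lists and its endpoint name are made up for the example, and the deployment call follows the Client API described on the Deploying a Function page, assuming TabPy runs locally on its default port 9004.

```python
def add_lists(x, y):
    """Add two equally long lists element by element.

    Tableau passes each argument as a list of values, so the
    function takes lists and returns a list of the same length.
    """
    return [a + b for a, b in zip(x, y)]


def deploy(url="http://localhost:9004/"):
    """Deploy add_lists to a running TabPy instance (the URL is an assumption)."""
    # Imported here so add_lists can be tried without TabPy installed.
    from tabpy.tabpy_tools.client import Client

    client = Client(url)
    # override=True replaces an endpoint with the same name if it already exists.
    client.deploy("add_lists", add_lists, "Adds two lists element-wise", override=True)


if __name__ == "__main__":
    # Try the function locally before deploying it.
    print(add_lists([1, 2, 3], [10, 20, 30]))
```

Once deployed, the function would be called from a Tableau calculation through tabpy.query, as described on the Using Deployed Functions page.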
As mentioned above, TabPy ships with some models which only need to be deployed. Deploying them is as easy as running the tabpy_deploy_models command in your terminal window after installing the TabPy package. All the models are deployed at once. Remember that TabPy needs to be running for the models to be deployed.
The following models are available at the time of writing:
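Since deployment fails without a running server, a quick reachability check before running tabpy_deploy_models can save some confusion. The sketch below is an assumption-laden example, not a TabPy tool: it assumes TabPy listens on its default local address http://localhost:9004 and queries the server's /info endpoint; the helper names are hypothetical.

```python
import urllib.request


def info_url(base_url="http://localhost:9004"):
    """Build the URL of TabPy's /info endpoint from the server base URL."""
    return base_url.rstrip("/") + "/info"


def tabpy_is_running(base_url="http://localhost:9004", timeout=3.0):
    """Return True if a server answers the /info request with HTTP 200."""
    try:
        with urllib.request.urlopen(info_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error or malformed URL.
        return False


if __name__ == "__main__":
    if tabpy_is_running():
        print("TabPy is up; tabpy_deploy_models can be run now.")
    else:
        print("TabPy does not seem to be running; start it first.")
```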
- Principal Component Analysis (PCA).
- Sentiment Analysis.
- Analysis of Variance (ANOVA).
An explanation of each model and how to invoke it in Tableau calculations can be found on the Predeployed Functions page.
TabPy is an open-source Python web server used for extending Tableau calculations or Data Prep data processing with Python scripts. You can read more about how to use TabPy with Tableau on the Using Python in Tableau Calculations page, in the Tableau blog post Building advanced analytics applications with TabPy, or in the Building Data Science Applications with TabPy Video Tutorial.
When you start using TabPy and search for information on how to install, configure and use it, you may find a lot of articles and blog posts that contradict each other and are outdated. You will also find people explaining how to install and use TabPy, deploy models to it and use its other features while mentioning tabpy_tools and maybe even some other buzzwords. So what are those, and what are the actual steps?
First, the most recent and up-to-date installation steps can be found on the project GitHub page TabPy Installation Instructions. As you can see on that page, installing TabPy is as simple as running one command: pip install tabpy. With that command, you have the latest approved TabPy package installed in your Python environment. To run it, simply execute the tabpy command.
Now, what are the tabpy_server and tabpy_client you may find mentioned? Those are old versions of TabPy from when it was split into two packages. Neither of them is recommended anymore, and if you have them installed you should remove them.
To give you more background: long ago, before TabPy became a package, it was built as an application (tabpy_server) and a library (tabpy_client) which were distributed as source code via GitHub. Those days are gone, and you no longer need to clone the GitHub repository, configure environment variables, run setup/startup scripts or perform other black magic steps.
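To see whether leftovers of the old split layout are still present in an environment, one option is to check whether the top-level modules named above can be imported. This is a rough sketch, assuming the old application and library were installed as top-level modules called tabpy_server and tabpy_client; the helper name find_importable is hypothetical.

```python
import importlib.util

# Module names of the old split layout, as mentioned in the text.
LEGACY_MODULES = ("tabpy_server", "tabpy_client")


def find_importable(names=LEGACY_MODULES):
    """Return the top-level module names that are importable in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    leftovers = find_importable()
    if leftovers:
        print("Legacy TabPy modules found:", ", ".join(leftovers))
    else:
        print("No legacy TabPy modules found.")
```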
Summary: you only need the tabpy package. Treat as obsolete all the posts and articles that mention tabpy_tools.