Blog Layout

MAY CONTAIN BAD ASS CONTENT: ANALYST DISCRETION IS ADVISED

Getting Started with Snowflake Data Exchange

James Smith • August 12, 2020

Get Live Data from Snowflake Data Exchange

In this post, we will run through how to connect your existing Snowflake account to the Snowflake data exchange or set up a trial so you can immediately start accessing live external data for your analytics.

Setup a new account:

If you already have access to a Snowflake account, congratulations!  You can skip these steps.  If not, go to trial.snowflake.com.   On the below page, indicate your edition (Enterprise is the correct choice for 95% of applications) and your chosen cloud.  Within minutes, you will receive an email with your account id.  It looks something like XX1234-us-west.azure.snowflake.com. When you activate the link, create your account admin username and password, and you are ready to go. 



Once your account is setup, on the worksheet tab, my recommendation is to immediately execute the following code:


CREATE WAREHOUSE MY_COMPUTE_WAREHOUSE;

ALTER WAREHOUSE MY_COMPUTE_WAREHOUSE SET WAREHOUSE_SIZE = "X-Small", AUTO_SUSPEND = 60;


This will create a brand new compute warehouse and set it to the smallest size with the minimum auto-off (after 60 seconds).  Unless you are dealing with huge data sets, for the type of investigation we will be doing these settings will ensure you get the most of your $400 in free credit (and if you stick to an extra-small warehouse you will easily get a full month of querying in!)

Connect and Search the Data Exchange

From your main page,  just hit the marketplace logo and begin to search through all the data providers that are available.  

From here, make sure you have selected account admin role and select your listing to subscribe to it.  Note that standard listings are available immediately.  You may need to set the role to ACCOUNTADMIN to subscribe.  Subscribing to a personalized listing requires some interaction with the provider and may or may not have a separate subscription fee.   In the image below ,  we are going to use this data set provided by Safegraph to power some of next steps.     

Once you have done that,  back in Snowflake you will immediately see all the tables in a new database in Snowflake (you may have to refresh.)

Then in line with best practice, run this line of code to allow for a non accountadmin to use the database.   (It is advised that you limit the number of people who use accountadmin role and do not use it as a role for querying data.)


GRANT IMPORTED PRIVILEGES ON DATABASE "SAFEGRAPH_AZURE_WESTEUROPE_SAFEGRAPH_CENSUS_DATA" TO ROLE SYSADMIN 

From here, you are ready to query this data,  however as we believe in no code/low code, the next post will show you how to connect to this data from Alteryx and the fun will begin!



December 17, 2020
Anyone who follows me knows how much I love the combination of Alteryx and Snowflake. Alteryx provides a logic layer that enables any analyst to manipulate data and Snowflake provides the scale needed to get some amazing insights across large data sets. However, I have found there were always a few things missing. The Alteryx In-DB tools, while comprehensive, cannot do everything. There are some gaps (most notably - ahem - any Alteryx product managers reading this - simple Pivot or Unpivot tools in-DB). There is also a lot of functionality like changing warehouses, schemas or databases that cannot be easily done from within Alteryx.
November 2, 2020
In our previous post, we discussed why d emand forecasting matters and how an accurate demand forecast can mathematically and quantitatively improve your service level and/or inventory levels (cost). Now it is time to focus on the biggest uncertainty of them all . Time. Advanced forecasting algorithms can help eliminate uncertainty through reducing variables that are “unknown” by understanding relationships at a scale that no human could comprehend. This helps turn the seemingly “unknown” in to a “known”. As we stated earlier, advances in math and cloud computing make finding these relationships much easier and can be computed almost instantaneously. However, the element of time is a tricky compounding factor. Even little unknowns become big over time as time compounds all errors and introduces more uncertainties. The most certainty you have is in the precise moment you make an estimate. When you look ahead 1 day, 1 week or even 1-month (as most corporate S&OP) cycles still do – then you must start to rely less and less on facts and more and more on assumptions. Assumptions mean uncertainty and uncertainty means cost. There are loads of examples of this. When I lived in Florida, every September during hurricane season , we would get the dreaded cone of uncertainty any time a hurricane formed. As below shows , the further out you get, the expected position gets wider. By day 5 - the uncertainty stretched over 500 miles – hardly accurate enough to do any planning of significance  .
By James Smith September 14, 2020
You cannot go far these days without hearing about data science, digital transformation or AI. This is especially true in the supply chain discipline. However, when taking it back to basics, what AI, ML and Data Science are really doing are solving mathematical equations that are hardly new. These equations are just trying to optimize a series of variables to minimize cost and maximize service – something supply chain professionals have been doing for decades. So what has changed? Cloud Computing and Cloud Storage! Cloud Computing is enabling supply chain professionals to perform more math, faster. Cheap Storage is providing the ability to store more data about products and environmental variables which gives more fuel for the math to solve.
Share by: