Jun 15, 2021 · Slots to the (Wallet) Rescue | Alessandro Puccetti · At Huq Industries, we collect and process 30 billion geo-events a month from more than 8 million devices. At the time of writing, we have collected and enriched more than 300 billion events, all stored in a BigQuery table totalling about 85 TB. In this blog post, I will… · Bigquery · 6 min read
Apr 13, 2021 · BigQuery Authorized Views with Terraform · In one of my previous blog posts, we saw how, at Huq Industries, we used authorised views to reduce costs, complexity, and delivery time. You can read more about it here. In this post, we will see how to implement authorised views in production and manage them as code… · Bigquery · 2 min read
Mar 31, 2021 · Real-Time data delivery at scale with BigQuery · Originally published at https://www.puccetti.io. At Huq Industries, we collect and process 30 billion events a month from more than 8 million devices. Every day, we enrich, slice, and then deliver data feeds in various forms to our clients, mainly via BigQuery, GCS, or S3. In the past 2 years, we have… · Bigquery · 4 min read
Mar 9, 2021 · Time Travel with BigQuery · How to use BigQuery superpowers to rewind time. Originally published at https://www.puccetti.io. Who does not like time travel? We have all seen it and been fascinated by it in many sci-fi movies; unfortunately, science has not cracked real-life time travel yet. However, we can “data time-travel”. Thanks to the amazing BigQuery… · Bigquery · 2 min read
Published in The Startup · May 10, 2020 · Serverless S3 Access Logs Analytics using BigQuery. · Originally published at https://www.puccetti.io on May 10, 2020. Do you use BigQuery? Are you interested in knowing how to integrate data from different cloud providers into BigQuery? In this blog post, we will implement a serverless and fully managed system to make S3 access logs available in BigQuery to easily integrate… · Bigquery · 6 min read
Jan 30, 2020 · BigQuery Geography Clustering · The BigQuery team rolled out support for the geography type a while ago, and they have never stopped improving performance and GIS (Geographic Information System) functions. This allows users to run complex geo-spatial analytics directly in BigQuery, harnessing all its power, simplicity, and reliability. Hold on to your keyboard (or your screen… · Bigquery · 2 min read
Published in Analytics Vidhya · Jan 21, 2020 · BigQuery Partitioning & Clustering · In this blog post, I will explain what the partitioning and clustering features in BigQuery are and how to use them to supercharge your query performance and reduce query costs. Partitioning: Partitioning a table can make your queries run faster while spending less. Until December 2019, BigQuery supported table partitioning only using the date data type. Now… · Bigquery · 5 min read
Jan 6, 2020 · BigQuery Wildcards · BigQuery supports the “*” wildcard to reference multiple tables or files. You can leverage this feature to load, extract, and query data across multiple sources, destinations, and tables. Let’s see what you can do with wildcards through some examples. The first thing is definitely loading the data into BigQuery. If… · Bigquery · 2 min read
Aug 1, 2019 · A Journey Through Big Data Using Google BigQuery · Back in the early days of Huq, we were ingesting just a few million records per day into our geo-behavioural insights platform. Today that figure is in the hundreds of millions. During the period when our traffic was ramping up intensively, we quickly realised that our single high-spec bare metal server… · Big Data · 3 min read
Jul 15, 2019 · Google Cloud Composer: Overcoming The Short-living Tasks Problem · Author’s GitHub and Twitter. Originally posted on the Huq Industries tech blog. Introduction: Google Cloud Composer is a Google Cloud Platform product that helps you manage complex workflows with ease. It is built on top of Apache Airflow and is a fully managed service that leverages other GCP products like Cloud SQL, GCS… · Google Cloud Platform · 5 min read