How To Reduce the Disk Space Need for Amazon Redshift — Part 2

How To?

  • Export a Redshift table to S3 (CSV)
  • Convert exported CSVs to Parquet files in parallel
  • Create the Spectrum table on your Redshift cluster
  • Ensure that psql is installed, since we need it to execute commands on the Redshift database. Install it if required.
  • Create the schema if required. This needs to be done only once. We will host our Spectrum tables in the spectrum schema.
  • Check whether spectrify is installed. If it is not, install it with all the necessary dependencies.
  • Export the data in the Redshift tables to CSV files on S3.
  • Convert the CSV files to Parquet format.
  • Move the files to the appropriate move path, so that we can support incremental exports.
  • Create the Spectrum table in Redshift. This only needs to be done once; there is no need to recreate the table on every run.
  • Truncate the Redshift table if required.
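
The steps above can be sketched as a small Python helper that builds each command without running it. This is a minimal sketch under stated assumptions, not the original script. The spectrify subcommands (export, convert, create_table) follow the Spectrify README as best we recall, so confirm them with spectrify --help before relying on them. The host, bucket paths, IAM role, catalog database name, the "_spectrum" table suffix, and the idea that the whole export folder is moved in one go are all hypothetical placeholders. Credentials are deliberately left out.

```python
import shlex
import shutil
from dataclasses import dataclass


@dataclass
class OffloadConfig:
    # Every value passed in here is a placeholder for illustration.
    host: str
    user: str
    db: str
    table: str
    s3_export_path: str  # where the exported and converted files are written
    s3_move_path: str    # where the files are moved for incremental runs
    iam_role: str        # role that Redshift Spectrum assumes to read S3
    port: int = 5439
    spectrum_schema: str = "spectrum"
    catalog_db: str = "spectrumdb"  # assumed external database name


def missing_tools(tools=("psql", "spectrify", "aws")):
    """Return the required command-line tools that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def psql_cmd(cfg, sql):
    """Build a psql invocation that runs one SQL statement on the cluster."""
    return ["psql", "-h", cfg.host, "-p", str(cfg.port),
            "-U", cfg.user, "-d", cfg.db, "-c", sql]


def spectrify_cmd(cfg, *args):
    """Build a spectrify call with the shared connection options."""
    return ["spectrify", f"--host={cfg.host}", f"--user={cfg.user}",
            f"--db={cfg.db}", *args]


def build_plan(cfg, create_table=True, truncate=False):
    """Return the ordered list of commands for one offload run."""
    create_schema = (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {cfg.spectrum_schema} "
        f"FROM DATA CATALOG DATABASE '{cfg.catalog_db}' "
        f"IAM_ROLE '{cfg.iam_role}' "
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    plan = [
        # One-time step; IF NOT EXISTS makes repeating it harmless.
        psql_cmd(cfg, create_schema),
        # Export the Redshift table to CSV files on S3.
        spectrify_cmd(cfg, "export", cfg.table, cfg.s3_export_path),
        # Convert the exported CSV files to Parquet.
        spectrify_cmd(cfg, "convert", cfg.table, cfg.s3_export_path),
        # Move the files to the move path (assumed layout, adjust as needed).
        ["aws", "s3", "mv", cfg.s3_export_path, cfg.s3_move_path,
         "--recursive"],
    ]
    if create_table:
        # Only needed on the first run for a given table.
        plan.append(spectrify_cmd(cfg, "create_table", cfg.s3_move_path,
                                  cfg.table, f"{cfg.table}_spectrum"))
    if truncate:
        # Optional: free the disk space on the cluster after the offload.
        plan.append(psql_cmd(cfg, f"TRUNCATE TABLE {cfg.table};"))
    return plan


if __name__ == "__main__":
    cfg = OffloadConfig(
        host="example-url.redshift.aws.com", user="myuser", db="mydb",
        table="my_table",
        s3_export_path="s3://example-bucket/export/my_table",
        s3_move_path="s3://example-bucket/spectrum/my_table",
        iam_role="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    )
    print("Missing tools:", missing_tools())
    for cmd in build_plan(cfg, truncate=True):
        print(shlex.join(cmd))  # dry run: print instead of executing
```

Printing the plan first keeps the sketch safe to try, because TRUNCATE cannot be undone. To execute a command for real, each list can be passed to subprocess.run(cmd, check=True). On later runs, calling build_plan with create_table=False skips the table creation, which only needs to happen once.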

What about Incremental Offload?

Deep Shah