Following Episode 71, which was about DevOps for Big Data, Episode 72 focused on databases, and it was great to be invited to take part in this one as well.
It was great to be invited back onto the Continuous Discussions podcast for an episode about big data.
When acquiring data for the data warehouse from source systems, it can be useful to make a clear distinction between the time at which an event occurred and the time at which the event was recorded by the source system. In the simplest case, the source system records the event at the time it occurs and the anomalies described below do not happen. But where there is a delay between the actual time of the event and the time the record of the event is received by the source system, there's a trap that needs to be avoided.
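As a minimal sketch of one way this can go wrong (the teaser does not spell out the trap, so this is an assumed illustration), consider an incremental load that picks up rows newer than the previous load time. The row layout, the `extract_since` helper and the timestamps are all hypothetical.

```python
from datetime import datetime

# Hypothetical source rows: (event_id, occurred_at, recorded_at)
rows = [
    ("a", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 0)),
    # Event "b" happened before midnight but was only recorded after it.
    ("b", datetime(2024, 1, 1, 23, 50), datetime(2024, 1, 2, 0, 20)),
    ("c", datetime(2024, 1, 2, 8, 0), datetime(2024, 1, 2, 8, 0)),
]


def extract_since(rows, watermark, time_index):
    """Return ids of rows whose chosen timestamp is later than the watermark."""
    return [row[0] for row in rows if row[time_index] > watermark]


# Assume the previous load ran at midnight; at that point "b" did not exist yet.
watermark = datetime(2024, 1, 2, 0, 0)

# Filtering on the time the event occurred skips the late-recorded row "b".
by_event_time = extract_since(rows, watermark, 1)

# Filtering on the time the source system recorded the event picks it up.
by_recorded_time = extract_since(rows, watermark, 2)

print(by_event_time)     # ['c']
print(by_recorded_time)  # ['b', 'c']
```

In this sketch, row "b" is never loaded when the extract filters on the event time, because it falls before the watermark yet was not present during the previous load.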
We are very pleased to announce that Cloud BI is now an AWS Consulting Partner.
Large table rebuilds need to be handled by the build.
On Tuesday I participated in an online panel on the subject of Continuous Improvement, as part of Continuous Discussions (#c9d9), a series of community panels about Agile, Continuous Delivery and DevOps.
Is there a way to make use of the cost savings of a transient EMR cluster and still have the convenience of a long-running version?
This article about the recent S3 slowdown and recovery notes that AWS originally pursued the wrong root cause. There's always a risk of this happening. Here we discuss the benefits of being able to revert changes.
We found that one type of data warehouse ELT logic test provides particularly high benefits for very limited effort.
We created a mechanism that we called "The Federator" for making data processed on one Redshift cluster available on other Redshift clusters. This post follows the introduction in part 1 and describes how we solved the challenge of dealing with large data volumes.
We created a mechanism that we called "The Federator" for making data processed on one Redshift cluster available on other Redshift clusters. This post introduces what we did.
How we built a solution that would keep Amazon Redshift in sync with SQL Server.
Two contrasting process options for data warehouse deployment.
This is an overview of how we made the "button" that deploys code to a Redshift enterprise data warehouse.
We developed a mechanism we called "DbAlign" for making incremental changes to our Redshift development, test and production databases.
This is an overview of the principles we followed when migrating an enterprise data warehouse to a cloud infrastructure with AWS Redshift. We added automated deployments and automated testing to the more traditional set of principles.