Introduction

Dr Peadar Grant

Data Architecture

1 History

Good treatment given in:

2 Key questions

  1. What data storage solution(s) are the best fit for a given situation?
  2. How can we accommodate different usage patterns, such as historical analytics vs online business processes?
  3. How can we handle scaling in terms of data volume or usage?
  4. What availability requirements does our data have, and how do we meet them?
  5. How should we integrate different systems to share and exchange data?
  6. Is access to the system centralised or distributed? Are we expected to accommodate disconnected / offline / asynchronous access?
  7. Why should we back up our data, and how is this best done?
  8. How should we provision all or parts of our data architecture: on-device, on-site, co-located in a data centre or supplied by a cloud provider?

In general, none of these questions has a single “right” answer. For most use cases, each individual question has several workable answers.

3 Choosing data storage systems

The best fit for any particular situation will depend on a number of factors, some of them technical but others organisational and human:

4 Assumptions

During this course we will make a few assumptions:

5 Technologies

PostgreSQL: relational database management system
Redis: key-value store (and more)
Neo4j: graph database
MongoDB: document store
DynamoDB: single-table NoSQL database
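
To give a rough feel for how the data models behind these technologies differ, the sketch below stores the same small piece of information (one customer with one order) in five shapes. It is an illustration only, not how any of these systems is actually driven: the customer/order record, all table, key and variable names are hypothetical, the standard-library sqlite3 module stands in for PostgreSQL, and plain Python dictionaries and tuples stand in for Redis, MongoDB, Neo4j and DynamoDB.

```python
import sqlite3

# Relational: rows in separate tables, linked by keys and combined with a join.
# Assumption: sqlite3 is used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id), "
    "total REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")
relational_rows = conn.execute(
    "SELECT c.name, o.total "
    "FROM customer c JOIN orders o ON o.customer_id = c.id"
).fetchall()
conn.close()

# Key-value: each value is looked up by exactly one key.
# Assumption: a dict stands in for a Redis keyspace.
key_value = {
    "customer:1:name": "Ada",
    "order:100:total": "25.0",
}

# Document: one nested, self-contained record per entity.
# Assumption: a dict stands in for a MongoDB document.
document = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"id": 100, "total": 25.0}],
}

# Graph: nodes with properties, plus labelled relationships between them.
# Assumption: dicts and tuples stand in for Neo4j nodes and relationships.
nodes = {
    "c1": {"label": "Customer", "name": "Ada"},
    "o100": {"label": "Order", "total": 25.0},
}
edges = [("c1", "PLACED", "o100")]

# Single table: every item type lives in one table, addressed by a
# (partition key, sort key) pair.
# Assumption: a dict keyed by tuples stands in for a DynamoDB table.
single_table = {
    ("CUSTOMER#1", "PROFILE"): {"name": "Ada"},
    ("CUSTOMER#1", "ORDER#100"): {"total": 25.0},
}

# The same question ("what did Ada order?") is answered differently in each model.
orders_from_document = [o["total"] for o in document["orders"]]
orders_from_graph = [
    nodes[dst]["total"] for src, rel, dst in edges if src == "c1" and rel == "PLACED"
]
orders_from_single_table = [
    item["total"]
    for (pk, sk), item in single_table.items()
    if pk == "CUSTOMER#1" and sk.startswith("ORDER#")
]
```

The information is identical in each case. What changes is how it is addressed and which questions are cheap to answer, which is one of the factors in choosing a storage system.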

References

   Thomas M Connolly and Carolyn E Begg. Database Systems: A Practical Approach to Design, Implementation and Management. Pearson, 6th edition, 2015.

   Luc Perkins, Eric Redmond, and Jim R Wilson. Seven Databases in Seven Weeks. The Pragmatic Bookshelf, 2018.