Choosing a database for your organization can be a daunting task given how many options are available. Not only this, the pros and cons of each one has to be weighed up and the chances are no single one will do everything you want.
The common denominator for businesses is to achieve fast and reliable scalability, possess a flexible system their developers can use because of friendly APIs (application programmer interfaces) and powerful performance with the capacity to manage multiple processors, servers, bandwidths etc.
The best approach to take is to identify what your needs, wants and goals are. Important questions to ask include:
- How often does the data need modifying or adding?
- Who will be the database administrator?
- Will they provide IT support the database?
- Will they maintain the data?
- Can the data be accessed via the internet is it safe?
- What support does the database hardware come with is there additional cost?
There are two core type of database: the relational database management system which includes Oracle, MySQL, MS Server, PostgreSQL, SQLite and the non-relational database which is a NoSQL document store including MongoDB, Couchbase as examples.
Relational (SQL) databases
Are based on relational algebra store data in data sets known as relations comprising of rows and columns where information is stored as a value in a specific cell. SQL is the language used to access a relational database and provides standardized predictability and utility. SQL databases excel in high structured data situations where internal processes are automated and data integrity is essential in the case of finance, patient data etc
Pros
- Data can be stored and retrieved easily using SQL and as such scalability is quick data can be added without affecting what is already there
- They handle highly structured data and support ACID (Atomicity, Consistency, Isolation, and Durability) transactions
- They are limited to what users can access and modify so they are good for applications that need tiered access
Cons
- They do not handle unstructured data very well at all and they do not like change
- Costly to install and even harder to scale. Horizontal scaling where more servers are added is faster and less costly than vertical scaling which means adding extra resources to the server
- Relational databases confuse this because they need Sharding or partitioning and distribution of horizontal data across machines to scale
- Sharding and ACID don’t mix very well
Non-relational (NoSQL) databases
Use document stores and exhibit a flexible schema unlike SQL which are much more rigid and stringent in the criteria for inserting. They do not use tables, rows or columns and are composed of key-value stores, column stores, graph stores and document stores. NoSQL databases excel for in depth data analysis and content management systems.
Pros
- Document stores are flexible and are equally at home handling partially-structured and unstructured data.
- Data type is not required at the outset so users who are unsure of what the incoming data will look like are better suited to this type of database
- Horizontal scaling is easy and sharding works better in this type of database enabling fast scaling to occur
- Modifications can occur without affecting all documents and causing downtime
Cons
- ACID is sacrificed for flexibility
- Querying cannot occur across documents
- Complex relationships reduce performance and scalability
Other types of non-relational databases
Key-value store (Redis)
A key-value store is a type of non-relational database where each value is associated with a specific key or a unique identifier associated with that value. Values don’t require predefined schema and can come in various forms such as: JSON, XML, HTML, PHP, numbers, strings, counters, videos, images. Key-value store is also known as an associative array. Key-value stores are great for user profiles and settings, or data that is accessed often but not updated or for use in blogs comments and online reviews.
Pros
- Flexibility is great and it can deal with a wide selection of data types
- It has high performance since keys are used to go straight to the value with no index searching or joins
- Key-value stores are portable and can be moved from one system to another without rewriting code
- They promote horizontal scalability
Cons
- Values are hard to query because they are not stored as define entities so reporting or editing values is tricky
- Not all objects are easy to model as key-value pairs
Wide-column store (Cassandra, HBase)
Also known as column stores or extensible record stores, wide-column stores are dynamic column-oriented non-relational databases. Instead of schema they use a keyspace to encompass column families which contain rows and columns. Timestamps also exit within the architecture. Wide-column stores are very good for data warehousing on big data analytics where speed is a factor.
Pros
- They are the best of both worlds of relational and non-relational databases for example this deals with both structured and partially-structured data better than other non-relational databases
- Updating is easy
- Horizontal scalability is high and scale rates are fast compared to relational databases
- Wide-column stores aggregate queries very well
Cons
- Wide-column stores are slower at handling transactions than relational databases
- Updating is easy in bulk but uploading and updating individual records is difficult
- They do not like ad-hoc query patterns, high level aggregations and changing database requirements