
The trend is to move all the tools that once dominated our computer desktops to the web and to our mobile devices.
Today the web browser and the smartphone have become indispensable tools both in our leisure time and at work: most of the desktop applications we used to communicate, store photographs, keep track of our time, make presentations and so on now have an equivalent app or web service.
Apps, Software-as-a-Service (SaaS) and cloud services dominate the marketplace today, and this has produced hundreds of successful startups whose applications serve millions of users. What was previously run in a closed server environment with 10 to 10,000 users now runs on the web with several million users.
One of the key elements of these systems is the database management system (DBMS), which we have squeezed to the maximum and which has had to overcome the scalability and concurrency challenges that these new business models entail.
The relational database model that Codd proposed in 1970 is still valid today, and relational DBMSs such as MySQL or Oracle have been heavily optimized and still dominate the market. Even Twitter, with its millions of users, continues to use MySQL, albeit behind several layers of middleware developed in-house that let it model distributed, graph-oriented databases, and with part of the data stored in a non-SQL system, specifically Cassandra.
When relational systems have been pushed to the limit, as happened to the Twitter developers, engineers have been forced to resort to workarounds such as middleware, intermediate caches, breaking third normal form (3NF) by duplicating some data, and other unintuitive solutions that complicate software maintenance and contribute to its degradation.
So ... is there an alternative? How do we deal with this type of development?
Nowadays the vast majority of people who work on the web have heard of non-relational database systems, either by that name or as Not Only SQL (NoSQL) systems. If that does not ring a bell, maybe you have heard of MongoDB, one of the most influential NoSQL systems.
However, our university training steered us towards highly structured data, with normal forms and strict rules to ensure consistency, so many developers have a hard time opting for these systems, even though they can make a difference in certain projects. NoSQL systems are easily scalable, much more versatile and, of course, fast. However, 'Magic always comes with a price', and it is not all advantages. In return for this scalability and versatility we have to give up some features of relational systems.
In the following graph we can see how some systems are classified. At one end we have Memcached, which offers virtually no functionality and can therefore scale easily. On the right side of the graph we have the relational systems, which offer a great deal of functionality at the cost of speed and scalability. In between we have the NoSQL systems which, by giving up some of the functionality of relational systems, climb quickly up the vertical (scalability) axis.
Therefore, before starting a development, the question is ...
... Can we give up these features?
Which ones?
Scalability? Yes please!
There are two ways to scale a system: vertical scalability and horizontal scalability.
- Vertical scalability: It consists of improving the capabilities of the machine, for example by adding RAM, faster storage devices, and more and better processors. Obviously this has a limit, and there comes a point where the system has to be redesigned.
- Horizontal scalability: It consists of adding nodes to the system so that requests are distributed among them. Although applications need to be prepared for this, this kind of scaling is much more advantageous in terms of cost and future possibilities. In principle there is no limit, since we can add as many nodes as we want, and it lets us start with minimal hardware and grow progressively without having to migrate the system.
The benefits of horizontal scalability are very clear in systems with seasonal traffic and sharp peaks, such as ticket sales for a concert. On the opening day of the sale we can scale the system up to withstand the initial avalanche of purchases, and then scale it down progressively as demand falls in order to reduce server costs.
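To make the idea of distributing requests between nodes concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, hashing each key and taking the remainder by the number of nodes; the node names (`db-1` ...) and the `order-N` keys are invented for illustration and do not come from any particular product.

```python
import hashlib

def node_for(key, nodes):
    """Pick the node responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

def distribute(keys, nodes):
    """Group keys by the node that would serve them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement

# Hypothetical workload: 1000 ticket orders.
orders = [f"order-{i}" for i in range(1000)]

# Normal traffic is served by two nodes; on opening day we add three more.
normal = distribute(orders, ["db-1", "db-2"])
peak = distribute(orders, ["db-1", "db-2", "db-3", "db-4", "db-5"])

for label, placement in (("normal", normal), ("peak", peak)):
    print(label, {node: len(keys) for node, keys in placement.items()})
```

With more nodes, each one handles a smaller share of the orders. Note that this naive modulo scheme moves most keys to a different node whenever the number of nodes changes; real systems usually rely on something like consistent hashing to limit that reshuffling.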
If the system we are developing can be served by a single database server and is not expected to grow massively in the future, my recommendation is clear: SQL systems have many more features and benefits than NoSQL. When we expect the system to grow exponentially, scalability plays a crucial role, because the system must be sized according to demand and the number of simultaneous users, ideally without modifying the underlying software. In these cases it is worth studying non-relational systems before undertaking the project.
As already mentioned, to achieve this scalability NoSQL systems have to sacrifice some features. Brewer's theorem (also known as the CAP theorem) states that a distributed system can only guarantee two of the following three properties:
- Consistency: The information always remains consistent when operations are performed on the data, and it is the same regardless of the node that answers our request.
- Availability: All the information stored in the system is always available, even if some of the nodes are not accessible.
- Partition tolerance: The system continues to function even if part of it is no longer accessible and some nodes are cut off from the network.
Relational databases meet the consistency and availability requirements. Non-relational systems, however, cannot do without partition tolerance if they are to achieve the required scalability, and this has produced two camps: those that give up consistency, such as Cassandra or CouchDB, and those that sacrifice availability, such as MongoDB, Redis or systems built on Paxos. In the first case it is up to us as developers to manage the inconsistencies that may appear in the data, which complicates development. In the second, if a node fails some of the data may become inaccessible, and we must prepare our systems to keep working without it.
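The trade-off between the two camps can be illustrated with a toy model. The sketch below is not how any of the databases mentioned work internally; `Replica`, `TinyCluster` and the `prefer_consistency` flag are names made up for this example, and the "network partition" is just a boolean.

```python
class Replica:
    """One node holding its own copy of the data (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.data = {}

class TinyCluster:
    """Two replicas and a switch that simulates a network partition."""
    def __init__(self, prefer_consistency):
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False
        self.prefer_consistency = prefer_consistency

    def write(self, replica, key, value):
        if self.partitioned:
            if self.prefer_consistency:
                # Giving up availability: refuse rather than diverge.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Giving up consistency: accept locally, replicas now differ.
            replica.data[key] = value
            return
        # No partition: both copies are updated together.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)

# Cluster that prefers consistency: the write is rejected during a partition.
cp = TinyCluster(prefer_consistency=True)
cp.write(cp.a, "seats", 10)
cp.partitioned = True
try:
    cp.write(cp.a, "seats", 9)
    cp_rejected = False
except RuntimeError:
    cp_rejected = True

# Cluster that prefers availability: the write succeeds, but b is stale.
ap = TinyCluster(prefer_consistency=False)
ap.write(ap.a, "seats", 10)
ap.partitioned = True
ap.write(ap.a, "seats", 9)

print(cp_rejected, ap.read(ap.a, "seats"), ap.read(ap.b, "seats"))
```

In the second cluster the application has to decide what to do when replica `b` still reports the old value, which is exactly the extra work described above.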
It is our responsibility as developers to study the best solution for each project, weighing the importance of each property to decide which one we can sacrifice.
Consistency, JOIN queries and transactions
In an ideal relational system the guiding principle is to store each piece of data in a single place, without duplication, which is why we take the data schema to third normal form. For example, if we want to store a list of subscribers to our blog along with their interests, we will have a table of subscribers and another of interests, and we will relate the two through their unique identifiers. To obtain the complete information we combine both tables with a JOIN query, which returns the information as if it were a single table.
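As a rough sketch of the subscribers example, the following Python snippet uses the standard `sqlite3` module with an in-memory database. The table and column names, the sample rows and the linking table `subscriber_interests` are assumptions made for illustration; the text only says that the two tables are related through their identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical normalized schema: each fact is stored exactly once.
conn.executescript("""
CREATE TABLE subscribers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE interests (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE subscriber_interests (
    subscriber_id INTEGER NOT NULL REFERENCES subscribers(id),
    interest_id INTEGER NOT NULL REFERENCES interests(id),
    PRIMARY KEY (subscriber_id, interest_id)
);
INSERT INTO subscribers VALUES (1, 'ana@example.com'), (2, 'luis@example.com');
INSERT INTO interests VALUES (1, 'databases'), (2, 'web');
INSERT INTO subscriber_interests VALUES (1, 1), (1, 2), (2, 1);
""")

# The JOIN presents the three tables as if they were a single one.
rows = conn.execute("""
SELECT s.email, i.name
FROM subscribers AS s
JOIN subscriber_interests AS si ON si.subscriber_id = s.id
JOIN interests AS i ON i.id = si.interest_id
ORDER BY s.email, i.name
""").fetchall()

for email, interest in rows:
    print(email, interest)
```

Renaming an interest here is a single `UPDATE` on one row of `interests`, because no other table holds a copy of the name.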
This simple example does not pose many problems, but some JOIN queries in complex databases can be too complicated and inefficient. This is where the flexibility of NoSQL systems comes in. Since storage space is no longer a problem, why not keep all the information about a record (or document) in a single table (or collection)? Then, when we need the information, we simply extract it without having to cross and mix different tables.
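A minimal sketch of that idea, using plain Python dictionaries to stand in for documents in a collection. The field names and sample data are invented for illustration, and `find_subscriber` is a hypothetical helper, not the API of any particular NoSQL product.

```python
# Hypothetical collection: each document embeds everything about a subscriber.
subscribers = [
    {
        "_id": 1,
        "email": "ana@example.com",
        "interests": [
            {"id": 1, "name": "databases"},
            {"id": 2, "name": "web"},
        ],
    },
    {
        "_id": 2,
        "email": "luis@example.com",
        "interests": [{"id": 1, "name": "databases"}],
    },
]

def find_subscriber(collection, email):
    """Return the whole document for an email, or None if it is missing."""
    for document in collection:
        if document["email"] == email:
            return document
    return None

ana = find_subscriber(subscribers, "ana@example.com")
print([interest["name"] for interest in ana["interests"]])
```

A single lookup returns the subscriber together with the interests, with no join. The price is that the name "databases" is now stored once per subscriber instead of once overall.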
It makes sense. However, returning to the previous example, when we have to update the information of an interest we will have to act on every record that includes that interest. What happens if an error occurs in the middle of the update? NoSQL does not get along very well with transactions, which is understandable if we consider that a query may be acting on several servers simultaneously, and if it fails on one of them the others have no way of knowing which operations completed successfully and which did not. In these cases it is our responsibility to restore the consistency of the data by implementing some rollback mechanism ... and this is the point where, little by little, the efficiency and versatility we gained by using NoSQL systems begin to run out of steam. Can we really write, for a project with a limited budget, more efficient code than that offered by relational systems that have been improving for more than 30 years?
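The following sketch shows what such a hand-written rollback might look like. It is only an illustration under simplifying assumptions: the documents live in a Python list, the failure is simulated with the hypothetical `fail_after` parameter, and the undo step itself is assumed never to fail, which a real distributed system cannot guarantee.

```python
# Hypothetical denormalized collection: the interest name is duplicated.
collection = [
    {"_id": 1, "interests": [{"id": 1, "name": "databases"}]},
    {"_id": 2, "interests": [{"id": 1, "name": "databases"},
                             {"id": 2, "name": "web"}]},
    {"_id": 3, "interests": [{"id": 1, "name": "databases"}]},
]

def rename_interest(documents, interest_id, new_name, fail_after=None):
    """Rename an embedded interest everywhere; undo all changes on failure."""
    undo_log = []  # (embedded interest, previous name) for each change made
    try:
        for position, document in enumerate(documents):
            if fail_after is not None and position == fail_after:
                raise ConnectionError("simulated node failure")
            for interest in document["interests"]:
                if interest["id"] == interest_id:
                    undo_log.append((interest, interest["name"]))
                    interest["name"] = new_name
    except ConnectionError:
        # Manual rollback: restore the old values in reverse order.
        for interest, old_name in reversed(undo_log):
            interest["name"] = old_name
        return False
    return True

def names_for(documents, interest_id):
    """Collect every stored copy of the name of one interest."""
    return [i["name"] for d in documents for i in d["interests"]
            if i["id"] == interest_id]

failed = rename_interest(collection, 1, "data stores", fail_after=2)
after_failure = names_for(collection, 1)

succeeded = rename_interest(collection, 1, "data stores")
after_success = names_for(collection, 1)

print(failed, after_failure)
print(succeeded, after_success)
```

Everything in `rename_interest` beyond the inner assignment is bookkeeping that a relational database would handle for us inside a transaction.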
Flexibility vs. Structured Information
Another characteristic of NoSQL systems is the absence of a data schema, which can be considered an advantage or a drawback depending on the point of view. It is undoubtedly much more flexible, and it makes sense if we consider that information is rarely as structured as relational systems require. Imagine that we have developed a system that has been running for a few months and we now realize that we need to add an address field to a table. No problem: since there is no schema, the previous records simply will not include that information, and we can add it to the new ones.
However, such flexibility can turn into a time bomb in projects carried out by large or changing teams. Returning to the address example, some developers may decide to store it as a text string and others as an object that separates the street, the city and the zip code. In the long run this can become a problem and require large amounts of time spent on refactoring.
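As an illustration of the kind of defensive code this situation tends to produce, here is a small sketch that reads an address regardless of which shape a record uses. The field names (`street`, `city`, `zip`) and the "street, city, zip" string layout are assumptions made for the example.

```python
def normalize_address(record):
    """Return the address of a record as a dict with street, city and zip."""
    address = record.get("address")
    if address is None:
        # Old records written before the field existed.
        return {"street": None, "city": None, "zip": None}
    if isinstance(address, str):
        # Assumed layout: "street, city, zip"; missing parts become None.
        parts = [part.strip() for part in address.split(",")]
        parts += [None] * (3 - len(parts))
        street, city, zip_code = parts[:3]
        return {"street": street, "city": city, "zip": zip_code}
    # Records that already store the address as an object.
    return {
        "street": address.get("street"),
        "city": address.get("city"),
        "zip": address.get("zip"),
    }

# Three hypothetical records written by different versions of the code.
records = [
    {"name": "Ana"},
    {"name": "Luis", "address": "Gran Via 1, Madrid, 28013"},
    {"name": "Eva", "address": {"street": "Main St 5", "city": "Leeds", "zip": "LS1"}},
]

for record in records:
    print(record["name"], normalize_address(record))
```

Every part of the application that reads addresses has to go through a function like this one, or be refactored once the team agrees on a single representation.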
Is it time for NoSQL?
Many alternatives currently exist in the market, with different levels of maturity. Even companies like Oracle have launched their own NoSQL solution, but none can compete with the robustness and support available for relational systems. This may be decisive for some companies, since most non-relational databases are still changing significantly from one version to the next, which makes it difficult to guarantee future compatibility for our developments.
As we have seen throughout this article, NoSQL systems are not a panacea. They do have some features that can be very attractive for certain projects, but that is no reason to rush into using them across the board in all our developments. It is our responsibility as engineers to determine when they can be beneficial and when we can give up the extra functionality offered by relational databases.
I end the post by opening the debate ...
What do you think about NoSQL systems?
Have you ever used one in any of your projects?
Do you think their disadvantages with respect to relational systems can be overcome?