SQL Server training :- When do we apply de-normalization ?(Interview question)
Articles,  Blog

SQL Server training :- When do we apply de-normalization ?(Interview question)


In the previous part of the video we saw the benefits of normalization we also saw how normalization help’s you to avoid redundant data and error data enter into your systems now i am sure that somewhere in your mind the three normal forms have must becomes golden rules for database designing but that hang one there is other side of the story as well we need to understand before we go and apply this normalization as golden rules for database designing in order to understand the other part of the story let us first understand the larger classification of IT systems from the data perspective from the data perspective we can classify IT system into two be classifications one is OLTP and the other one is OLAP so OLTP stands for online transaction processing system while OLAP stands for online analytical processing system now OLTP system are those systems which deal with regular transaction in other words your normal insert update delete those kind of queries are more suited for OLTAP systems and what it also means is that all this operations that is insert update and delete either bringing data into the system into your application or they removed data from your application so what it means it that when the data is coming inside your application you would like to avoid redundant data you would like avoid any kind of errorneous data or you would like to avoid data which does not have integrity which we discussed when we discussing the normalization rules so in other words for OLTAP system definitely normalization that is applying first normal form second normal form third normal form is the right way to go ahead why because we don’t want to get data into the system which is erroneous which is duplicate which is redundant and etc.. but now let’s considers some difference operations which are not normal a daily transaction operations for examples a manager wants to take large number of historical data back dated historical data and he want’s to go and analyze that data he want’s go and do some forecasting on that data he wan’t to do some kind of a sales management using that data now these kind of operations here the data is not coming into the system but there is a heavy operation which is done which is from the prospective of selecting the data from the prospective of messaging the data from the prospective of creating complicated report’s on it so these kind of systems are called as OLTP system that is online analytical processing systems so if you try to visualize the over all motive of OLTP and OLAP systems the main goal of OLTP system that we would like to have faster insert rapid update rapid deletes you would like to improve the data quality by reducing redundancy etc.. while in OLAP system you would like to do faster analyses you would like to search faster so in other words the main goal of OLTP system is more of data manipulation while the main goal of OLAP system is faster search and faster analysis now let me make a statement here normalization is not suited for OLAP system WHY so what we do normalization as we discuss in previous video all about breaking data into granular pieces ensuring that we don’t have redundant data so what’s happens is we end up with lot of tables now when we say that in OLAP systems our main motive is to do faster search because search has to go and pull data from multiple tables the search start becoming slow so in OLAP systems rather then doing normalization we do the vice-versa of it that is we de-normalization so let me just demonstration why normalize design is not suited for OLAP systems so you can see over here i have very normalized design here so you can see that this is a perfectly normalized database design where i have customers table which has a contact name of the customer there is one more table here called as a orders table which tells me that how much orders these customer have taken ok and second each one of these orders i have detail more final detailing it saying that what kind of what kind of product he has bought how much quantity it is and what is the price of each one of those product so now let’s say that you one to go and drive a very simple report here with say’s that ok give me the contact name of the customer give me the unit price give me the quantity and the total sales so what i will do is we already have all the fields but we don’t have the total sales of total sales we need to multiply the unite price with the quantity so i will just say here comma and i will multiply this unit price with the quantity so this will give out the total sales to us so if i do execute again you can now see that we can see there is contact name there is unite price quantity and the sales is multiplication of the both of them now let me just take this Sql here Ctrl+C and let me open up a new query window here and let me just go and see the statistics of this Sql so what i am going to do is i am going to on my statistics here in order to go and on your statistics you have to say set statistics io on and let me say execute now once i do execute right you can see that it has gone ahead it has executed the Sql that is fine but let’s go the messages now this is very interesting here now in the messages you can see that in order to get this data right he is making inner join with the orders table he is making inner join with a customers table he using the order details tables and you can see that there is lot’s of logical reds as well happening over here now if you think about OLAP systems where the search has to be faster reading from so many tables is definitely not efficient in other words it will definitely slow down your select query so how about rather then keeping this data into all of these tables how about taking all this data and combining into one table some de normalize table so what will happen is rather then going and making select into all of these table we can just go and make a select into that one big de normalize table and we can improve our efficiency so what i have done here is you can see that i have created one more table here which is called as the customer de-normalize table so if you go to the design here i have created one big table what it does is it just take the customer name and the total sales so what will now happen is that if i go and just select this table now ok and if you see this table already all the import has been done into this table all the back hand activity of calculating multiplying everything has been done and the hole aggregation is now stored into this table so if i see the efficiency of this select now first thing is we have got all the three tables we have only one table now so definitely rides all efficient so if you go and on your statistics over here set statistics io on execute this you can see now he is only fetching from one table so if i take this one let me just CTRL+A Ctrl+C notepad so this thing is coming from denormalized and this thing is coming from Normalized so you can now see from this both statistics right the first thing is the number of table is normalized is higher so definitely the select is slower and if you see the denormalized the select will be faster why because you are only fetching from one table so numbers of table at decrease second if you also look at the scan count i meant to say scan count here is 91 + 1 + 1 whatever it is but here the scan count is only one look at the logical reads at one an the logical reads is 184 and adding to in other 11 and 5 while here the logical reads is only 9 so in other words denormalization is all about combining the data into one big table rather then going and fetching data from multiple tables so summarizing what exactly is denormalization, denormalization is absolute opposite of normalization in other words denormalization we can have redundant data we can have duplicate data we can have aggregate data we will actually violet the three normal forms why because our main intention there is two make our search faster when do we use denormalization denormalization is very much useful for OLAP Systems where faster query where faster analyzing is more important rather then insert update and deletes where do we use normalization for OLTP systems where our main goal is to do insert update and delete and we want quality data to enter into our system now set and don it does not mean that when you want faster searches you start applying denormalization in other words it does not mean now here there are four tables let make it one table let’s make this 10 tables one table that is not the way again in denormalization we have twp great techniques here which we can apply and make our OLAP Systems much better one is the snow Flake design and the other one is the Star Design so what will do is in the next part of the video will try to understand both of these denormalization techniques which can help us to make our OLAP Systems much better So i hope that you enjoyed this video in the next video the next video will understand both of these denormalization techniques Now here is a small favor you can do for us or i will say it’s a small request from us if you think that whatever we are doing here on this channel is cool it is nice it will help out people what you can do is you can go ahead and share this video either on your facebook account on your twitter account on your Orkut on blogger whichever channel here associated with please do go ahead and talk about this video by doing this small favor you are helping us to know that what activity we are doing is it worth for the community or not So go ahead and if you like this video share it on facebook account, Twitter, blogger, My Space, Orkut, Google Plus whatever it is and let the world know that hers is a resource of videos which DOT NET Developers can see and they can learn from it. THANK YOU SO MUCH 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *