Data can be skewed by its nature. How can you tell if you have skewed data in your database? And if you do, how do you handle it when writing queries? Mitigating skewed data is hard, and the right approach is rarely obvious. There is no magic formula, just a few tricks that have to be justified on a case-by-case basis. Broadly speaking, you can work around skewed data in two ways: adjust the data you have in software, or look for hardware solutions.
In this whitepaper, Joe Celko shares a few tricks to consider on a case-by-case basis. The whitepaper covers descriptive statistics, how a database becomes skewed, data that is not on a valid scale, data that is simply skewed, the bell curve versus the Tracy-Widom distribution, data that is normal but filtered, how to handle the data, adjusting the data, changing your scales, and physically partitioning the disk.
Presenter: Joe Celko
Joe Celko is the author of a series of ten books on SQL and RDBMS (MKP/Elsevier) that have been in print for over 20 years. He served for 10 years on the ANSI/ISO database standards committee. He has written columns and articles for the IT trade press for over 30 years. He currently enjoys being a TEALS volunteer and judging the local High School Science Fest once a year.