<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The GARA Systems Blog &#187; Data Management</title>
	<atom:link href="http://www.gara.com/blog/category/data-management/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.gara.com/blog</link>
	<description>The Online Technology Resource for Small Business</description>
	<lastBuildDate>Wed, 24 Aug 2011 15:19:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Anatomy of Data Warehouse</title>
		<link>http://www.gara.com/blog/2010/02/08/anatomy-of-data-warehouse/</link>
		<comments>http://www.gara.com/blog/2010/02/08/anatomy-of-data-warehouse/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 16:38:32 +0000</pubDate>
		<dc:creator>Gary Keorkunian</dc:creator>
				<category><![CDATA[Data Management]]></category>

		<guid isPermaLink="false">http://www.garasystems.com/blog/?p=308</guid>
		<description><![CDATA[When I bring up the concept of a data warehouse to my clients I often get the same few questions. What is a data warehouse? Why do I want or need a data warehouse? What&#8217;s involved in creating a data warehouse? In this post I will attempt to answer each of these questions. What is [...]]]></description>
			<content:encoded><![CDATA[<p>When I bring up the concept of a data warehouse to my clients I often  get the same few questions.</p>
<ul>
<li>What is a data warehouse?</li>
<li>Why do I want or need a data warehouse?</li>
<li>What&#8217;s involved in creating a data warehouse?<span id="more-308"></span></li>
</ul>
<p>In this post I will attempt to answer each of these questions.</p>
<p><strong>What is a data warehouse?</strong></p>
<p>First off, do not be overwhelmed by the term &#8220;Data Warehouse&#8221;. Data Warehouse is just a fancy name for a database with a special purpose. Like other databases, data warehouses are usually set up in Relational Database Management Systems (RDBMS) such as MySQL, MS SQL Server, Oracle or other comparable systems. And like other databases, data warehouses are basically made up of tables, queries and other objects with properties regarding their structure, access rules, etc.</p>
<p>There are two key differences between data warehouses and operational  databases, however.  First, data warehouses are typically implemented  to store a more comprehensive set of your organization&#8217;s data, both in  terms of operational scope and history.  They do this to provide you  with a single repository of all operational and historical data.  This  is the special purpose of a data warehouse.</p>
<p>The second key difference is that data warehouse designs are optimized to support reporting and data analysis requirements: data needs to be retrieved from the system quickly. Operational databases, on the other hand, are usually optimized to support transaction processing requirements: data needs to be added and updated quickly.</p>
<p><strong>Why do I want or need a data warehouse?</strong></p>
<p>There are a number of reasons why an organization would implement a  data warehouse.</p>
<p>The first reason is to consolidate organizational data into one database to support comprehensive reporting and data analysis. Many organizations use more than one operational system to maintain their data. There are accounting systems, point-of-sale systems, CRM systems, time-clocks, etc., each of which stores information in its own way. Within the data stores of these systems is a wealth of information. Unfortunately, many organizations find it very difficult and often impractical to create reports that span these sources in any meaningful way. The data is simply too segregated. By consolidating this information into a data warehouse, reporting and data analysis become much easier tasks. Decision makers and others can be given access to tools that let them see the big picture and drill down on the detail.</p>
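<p>To make the idea concrete, here is a minimal sketch of a report that spans two sources once they sit in one database. It uses the sqlite3 module that ships with Python as a stand-in for the warehouse RDBMS. The table names, columns and sample rows are all hypothetical and chosen only for illustration.</p>

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all names and rows are made up.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE pos_sales (employee_id INTEGER, sale_date TEXT, amount REAL);
    CREATE TABLE timeclock_hours (employee_id INTEGER, work_date TEXT, hours REAL);
""")

# Rows as they might arrive from two separate operational systems.
warehouse.executemany(
    "INSERT INTO pos_sales VALUES (?, ?, ?)",
    [(1, "2010-02-01", 400.0), (1, "2010-02-02", 200.0), (2, "2010-02-01", 300.0)],
)
warehouse.executemany(
    "INSERT INTO timeclock_hours VALUES (?, ?, ?)",
    [(1, "2010-02-01", 8.0), (1, "2010-02-02", 4.0), (2, "2010-02-01", 6.0)],
)

def sales_per_hour(conn):
    """Return (employee_id, sales per hour worked), a report spanning both sources."""
    query = """
        SELECT s.employee_id, s.total_sales / h.total_hours
        FROM (SELECT employee_id, SUM(amount) AS total_sales
              FROM pos_sales GROUP BY employee_id) AS s
        JOIN (SELECT employee_id, SUM(hours) AS total_hours
              FROM timeclock_hours GROUP BY employee_id) AS h
          ON h.employee_id = s.employee_id
        ORDER BY s.employee_id
    """
    return conn.execute(query).fetchall()

print(sales_per_hour(warehouse))
```

<p>Neither the point-of-sale system nor the time-clock could answer this question alone. Once both sets of rows live in one database, a single query does it.</p>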
<p>The second reason is to maintain a more comprehensive history of information. Some operational systems, in order to maintain adequate performance, need to archive or purge their data after a while. I am currently working with a client whose point-of-sale system purges all transactions after 3 months. Because they create 25,000 invoices per month, this makes sense. Allowing that data to accumulate could have adverse effects on performance and cause lag time at the point-of-sale. That would be a customer service no-no, so the point-of-sale system is kept lean and mean. Of course, purging the older data means historical information is lost. By implementing a data warehouse, we are able to preserve and query this historical data indefinitely without degrading the performance of operational systems.</p>
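<p>One way this can work is sketched below, again with Python and sqlite3 standing in for both systems. The <code>invoices</code> and <code>invoice_history</code> tables and the <code>archive_invoices</code> function are hypothetical names, not part of any real point-of-sale product. The assumption is that each invoice has a unique id, so copying can be repeated without creating duplicates.</p>

```python
import sqlite3

def archive_invoices(pos, warehouse):
    """Copy every invoice currently in the point-of-sale database into the warehouse.

    INSERT OR IGNORE skips invoices already archived, so the copy can be
    repeated safely on every run.
    """
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS invoice_history ("
        "invoice_id INTEGER PRIMARY KEY, invoice_date TEXT, total REAL)"
    )
    rows = pos.execute("SELECT invoice_id, invoice_date, total FROM invoices").fetchall()
    warehouse.executemany("INSERT OR IGNORE INTO invoice_history VALUES (?, ?, ?)", rows)
    warehouse.commit()

def history_count(warehouse):
    """Number of invoices preserved in the warehouse."""
    return warehouse.execute("SELECT COUNT(*) FROM invoice_history").fetchone()[0]

# Hypothetical point-of-sale database with two invoices.
pos = sqlite3.connect(":memory:")
pos.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, invoice_date TEXT, total REAL)"
)
pos.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, "2009-11-02", 19.99), (2, "2010-01-15", 45.00)],
)

warehouse = sqlite3.connect(":memory:")
archive_invoices(pos, warehouse)

# The point-of-sale system purges its oldest invoice and records a new one.
pos.execute("DELETE FROM invoices WHERE invoice_id = 1")
pos.execute("INSERT INTO invoices VALUES (3, '2010-02-08', 12.50)")
archive_invoices(pos, warehouse)

print(history_count(warehouse))
```

<p>After the purge the point-of-sale database holds two invoices, while the warehouse still holds all three.</p>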
<p>A third reason is to act as a master for data synchronization tools. Data often overlaps in operational systems. Employee information, for example, can be stored in point-of-sale systems, time-clock systems and more. A data warehouse can be a useful tool that helps keep this information in sync. Good syncing tools save your staff time and help reduce the chance of keying errors. In a similar way, a data warehouse can be used to support other system integration functions like moving transactions from one system to another.</p>
<p>A data warehouse, when implemented properly, is a tool that supports  decision making, improves the integrity of your data, and maximizes your  staff&#8217;s productivity.</p>
<p><strong>What&#8217;s involved in creating a data warehouse?</strong></p>
<p>Creating a data warehouse is a process that does require good  technical skills and experience in data architecture and software  engineering. Nevertheless, the overall process can be understood by  most.</p>
<p>The first step in creating a data warehouse is to <strong>implement a  platform</strong> that will support it.  This means you will need a server  machine running an RDBMS package.  Many factors go into selecting the  appropriate machine.  These include such things as the number of users,  the frequency of data migrations, the complexity of reports and data  analysis as well as any other applications the server must support.  It  is important to use some of the proven methods for identifying the  hardware requirements of a proposed application server.</p>
<p>As for the RDBMS, I have had great success with MySQL, and because it&#8217;s open source, the only expense associated with it is the time it takes to set up and administer it. Of course, data warehouses can be implemented with any number of database products. If you already run a specific RDBMS package, it will probably be cost-effective to use it for your data warehouse as well. It is important to evaluate any proposed package against your specific requirements.</p>
<p>The second step is to <strong>design the data model</strong>.  This involves  identifying all of the information that the warehouse must maintain.   This is usually done by examining what information is available in your  operational systems and identifying specific reports that will be  required.  From there we create a model &#8211; essentially a list of tables &#8211;  that will hold this data and support the reporting requirements.  Once  the model is developed it must be implemented on the server.  This is  done either by writing scripts or using tools provided by the RDBMS  vendor.</p>
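<p>As an illustration of the scripted route, the sketch below creates a very small model. It assumes Python with the built-in sqlite3 module in place of a production RDBMS, and the <code>employee</code>, <code>store</code> and <code>sale</code> tables are hypothetical examples rather than a recommended design.</p>

```python
import sqlite3

# Hypothetical model: one table of facts (sales) and two tables describing them.
SCHEMA = """
CREATE TABLE IF NOT EXISTS employee (
    employee_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS store (
    store_id INTEGER PRIMARY KEY,
    city     TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sale (
    sale_id     INTEGER PRIMARY KEY,
    sale_date   TEXT NOT NULL,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    store_id    INTEGER NOT NULL REFERENCES store(store_id),
    amount      REAL NOT NULL
);
-- Reports usually filter by date, so index that column.
CREATE INDEX IF NOT EXISTS idx_sale_date ON sale (sale_date);
"""

def create_model(conn):
    """Create the warehouse tables; safe to run again because of IF NOT EXISTS."""
    conn.executescript(SCHEMA)

def table_names(conn):
    """List the tables that exist in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
create_model(conn)
print(table_names(conn))
```

<p>Keeping the model in a script like this means it can be re-run on a new server or stored alongside the migration tools.</p>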
<p>The third step is to <strong>create the data migration tools</strong> that will  copy data from your operational systems into the data warehouse.  In  some cases the tools will be scripts or programs that access data using  standard interfaces such as ODBC or COM, or a proprietary software  development kit (SDK) provided by the vendor.</p>
<p>Other scripts will import data from files in either proprietary  formats or common formats like CSV.  This will be the case when  operational systems don&#8217;t offer interfaces and only support export  functions or we are receiving files from vendors, customers, field  offices or other places.</p>
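<p>A CSV import script can be quite short. The sketch below uses the csv and sqlite3 modules from the Python standard library. The column names and the <code>import_sales_csv</code> function are hypothetical, and an in-memory file stands in for a real export.</p>

```python
import csv
import io
import sqlite3

def import_sales_csv(conn, csv_file):
    """Load rows from a CSV file object into the sale table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sale ("
        "sale_id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)"
    )
    reader = csv.DictReader(csv_file)
    rows = [(int(r["sale_id"]), r["sale_date"], float(r["amount"])) for r in reader]
    # INSERT OR REPLACE lets a corrected export overwrite rows loaded earlier.
    conn.executemany("INSERT OR REPLACE INTO sale VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

# A small in-memory file stands in for an export from an operational system.
export = io.StringIO(
    "sale_id,sale_date,amount\n"
    "1,2010-02-01,19.99\n"
    "2,2010-02-01,45.00\n"
)

conn = sqlite3.connect(":memory:")
print(import_sales_csv(conn, export))
```

<p>With a real file, the same function would be called with <code>open("export.csv", newline="")</code> in place of the in-memory file.</p>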
<p>Many RDBMS tools, including MySQL, have wizards available for  creating migration scripts that support common data source formats.   This can help speed up the process of creating these tools.</p>
<p>In addition, any number of software development tools can be used to create the necessary programs.</p>
<p>The fourth step is to <strong>set up a migration schedule</strong>. To be most useful to decision makers, the data warehouse must be updated on a regular basis. Operational data changes constantly and the data warehouse needs to keep pace.</p>
<p>In cases where operational data can be accessed directly, without user intervention, we simply set up the migration tools to run at predetermined times using a task scheduler.</p>
<p>For the cases where data is imported from files, we need to create  tools that allow users to initiate the migration by using a File | Open  style dialog or by simply uploading, copying or otherwise placing the  file into a specified location.</p>
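<p>The &#8220;specified location&#8221; approach might look like the sketch below, written in Python with only the standard library. The folder names and the <code>process_drop_folder</code> function are hypothetical. The assumption is that each file should be loaded once and then moved aside.</p>

```python
import shutil
import tempfile
from pathlib import Path

def process_drop_folder(drop_dir, done_dir, load_file):
    """Load each CSV file waiting in drop_dir, then move it to done_dir.

    load_file is whatever function imports one file into the warehouse.
    """
    done_dir.mkdir(parents=True, exist_ok=True)
    handled = []
    for path in sorted(drop_dir.glob("*.csv")):
        load_file(path)
        # Moving the file out of the way prevents loading it twice.
        shutil.move(str(path), str(done_dir / path.name))
        handled.append(path.name)
    return handled

# Demonstration with a temporary folder and a loader that only records names.
base = Path(tempfile.mkdtemp())
drop = base / "incoming"
done = base / "processed"
drop.mkdir()
(drop / "field_office.csv").write_text("sale_id,amount\n1,10.00\n")

loaded = []
first_run = process_drop_folder(drop, done, lambda p: loaded.append(p.name))
second_run = process_drop_folder(drop, done, lambda p: loaded.append(p.name))
print(first_run, second_run)
```

<p>A task scheduler can run a script like this every few minutes, so users only need to copy their files into the incoming folder.</p>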
<p>Deciding on the specific migration frequency is simply a balancing  act between the need for real time reporting and the load each data  migration puts on the data server and other operational systems.</p>
<p>Last but certainly not least, we need to <strong>create reports and other data analysis tools</strong> that support decision making and other organizational objectives. With a properly designed and implemented data warehouse, report writing becomes a breeze. Because the warehouse is built using a SQL-based RDBMS, reports and analysis tools can be developed easily using any number of available tools.</p>
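<p>For example, a monthly sales summary comes down to a single SQL query. The sketch below runs it from Python against sqlite3. The <code>sale</code> table and its sample rows are hypothetical, and dates are assumed to be stored as YYYY-MM-DD text.</p>

```python
import sqlite3

def monthly_sales(conn):
    """Return (month, total) pairs, the kind of summary a report would display."""
    query = """
        SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total
        FROM sale
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("2010-01-15", 100.0), ("2010-01-20", 50.0), ("2010-02-08", 25.0)],
)

for month, total in monthly_sales(conn):
    print(month, total)
```

<p>Any reporting tool that can issue SQL could run the same query and present the result as a table or chart.</p>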
<p>So there you have data warehouses in a nutshell. They help you make more informed decisions, preserve a comprehensive history of your organization&#8217;s data and improve information management processes. And with the assistance of an experienced data and software consultant, building one is a relatively straightforward process.</p>
<p><a title="Contact GARA" href="http://www.gara.com/contact/">Contact me</a> if you would like to learn more  about how to build your data warehouse.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.gara.com/blog/2010/02/08/anatomy-of-data-warehouse/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

