Data Warehouse And Its Applications In Agriculture


K. P. Wagh, Assistant Professor, Gf’s GCOE Jalgaon, Kishorwagh2000@yahoo.com

Dr. Satish R. Kolhe, Reader, NMU Jalgaon, srkolhe2000@gmail.com

 

A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database that is used to hold data for reporting and analysis.

Economic foundations and productivity growth depend on the agricultural sector. Agriculture is the driving force behind the way of life and the source of earnings for the majority of people. More than 60 percent of the population lives in rural areas, and the majority are farmers. Rural communities, although they are the main contributors to the country's food production and food security, earn only 11 percent of Gross Domestic Product (GDP). The arrival of the information age guides this country to new development strategies.

The National Electronics and Computer Technology Center (NECTEC), in collaboration with the Ministry of Agriculture, has launched the “Agriculture Information Network” in response to the unmet information requirements of the agricultural sector. Farmers should benefit from the content provided, which includes risk assessment, an agriculture warning system and an agricultural knowledge base, and which aims to improve the technology, productivity, income and stability of India's agriculture sector in the age of Information Technology. The data warehouse consists of common databases and geo-spatial databases from various departments and organizations in the country and abroad. Farmers can access the content through the Internet by themselves or through groups of professionals called “Information Brokers”.

 

Keywords: Data Warehouse, Agriculture, IT

 

 

1.    Introduction

A data warehouse [1] is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database that is used to hold data for reporting and analysis.

  

Goals of Data Warehousing

• Facilitate reporting as well as analysis

• Maintain an organization's historical information

• Be an adaptive and resilient source of information

• Be the foundation for decision making

  

Data Warehouse Architecture

Data warehouse architecture comprises:

• Operational source systems

• A data staging area

• One or more conformed data marts

• A data warehouse database

 

Operational Source Systems

Operational source systems [1] are developed to capture and process original business transactions. These systems are designed for data entry, not for reporting, but they are the source from which the data warehouse is populated.

 

Data Staging Area

The data staging area is where the raw operational data is extracted, cleaned, transformed and combined so that it can be reported on and queried by users. This area lies between the operational source systems and the user database and is typically not accessible to users.

 

Data staging is a major process that includes the following sub-procedures:

Extraction

The extract step is the first step of getting data into the data warehouse environment. Extracting means reading and understanding the source data, and copying the parts that are needed to the data staging area for further work.
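As a rough illustration of the extract step, the sketch below reads a small source export and copies only the needed fields into a staging list. The CSV layout, the field names and the `extract` helper are all assumptions made for this example; a real operational source would look different.

```python
import csv
import io

# Hypothetical export from an operational source system (assumed layout).
SOURCE_CSV = """farm_id,farmer_name,crop,yield_kg,internal_audit_code
F001,Farmer A,Cotton,1200,X9
F002,Farmer B,Banana,3400,Q2
"""

# Only these fields are needed downstream; everything else stays behind.
NEEDED_FIELDS = ["farm_id", "crop", "yield_kg"]


def extract(source_text, needed_fields):
    """Read the source rows and copy only the needed fields to staging."""
    reader = csv.DictReader(io.StringIO(source_text))
    return [{field: row[field] for field in needed_fields} for row in reader]


staging_rows = extract(SOURCE_CSV, NEEDED_FIELDS)
```

Values are still raw strings at this point; parsing and cleaning belong to the transformation step.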

Transformation

Once the data is extracted into the data staging area, there are many transformation steps, including:

1.  Cleaning the data by correcting misspellings, resolving domain conflicts, dealing with missing data elements, and parsing into standard formats.

2.  Purging selected fields from the legacy data that are not useful for the data warehouse.

3.  Combining data sources by matching exactly on key values or by performing fuzzy matches on non-key attributes.

4.  Creating surrogate keys for each dimension record in order to avoid dependency on legacy-defined keys, where the surrogate key generation process enforces referential integrity between the dimension tables and fact tables.

5.  Building the aggregates for boosting the performance of common queries.
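Three of the steps above (cleaning, surrogate key creation and aggregate building) can be sketched in a few lines. This is a minimal illustration under assumed data: the staged rows, the `SPELLING_FIXES` table and all function names are hypothetical, and production ETL tools handle far more cases.

```python
from collections import defaultdict

# Hypothetical staged rows; the odd spellings and missing yield are deliberate.
staged = [
    {"farm_id": "F001", "crop": "cotton ", "yield_kg": "1200"},
    {"farm_id": "F002", "crop": "Coton", "yield_kg": ""},
    {"farm_id": "F003", "crop": "BANANA", "yield_kg": "3400"},
]

# Assumed correction table for known misspellings.
SPELLING_FIXES = {"coton": "cotton"}


def clean(row):
    """Normalise the crop name and parse yield; a missing yield becomes None."""
    crop = row["crop"].strip().lower()
    crop = SPELLING_FIXES.get(crop, crop)
    yield_kg = int(row["yield_kg"]) if row["yield_kg"] else None
    return {"farm_id": row["farm_id"], "crop": crop, "yield_kg": yield_kg}


def assign_surrogate_keys(rows):
    """Give each distinct crop a warehouse-owned integer key."""
    crop_keys = {}
    for row in rows:
        crop_keys.setdefault(row["crop"], len(crop_keys) + 1)
    return crop_keys


def aggregate_yield(rows, crop_keys):
    """Pre-compute total yield per crop key, skipping missing values."""
    totals = defaultdict(int)
    for row in rows:
        if row["yield_kg"] is not None:
            totals[crop_keys[row["crop"]]] += row["yield_kg"]
    return dict(totals)


cleaned = [clean(r) for r in staged]
crop_keys = assign_surrogate_keys(cleaned)
totals = aggregate_yield(cleaned, crop_keys)
```

The surrogate keys here are plain sequential integers owned by the warehouse, so fact rows never depend on whatever identifiers the legacy system used.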

Loading and indexing

At the end of the transformation process, the data is in the form of load record images. Loading in the data warehouse environment usually takes the form of replicating the dimensional tables and fact tables and presenting these tables to the bulk loading facilities of each recipient data mart. Bulk loading is a very important capability that is to be contrasted with record-at-a-time loading, which is far slower. The target data mart must then index the newly arrived data for query performance.
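The load-then-index sequence might look like the following sketch, which uses Python's built-in `sqlite3` module as a stand-in for a data mart. The table name, columns and sample records are assumptions, and `executemany` inside one transaction only approximates a dedicated bulk loader.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a recipient data mart
conn.execute(
    "CREATE TABLE fact_yield (crop_key INTEGER, season TEXT, yield_kg INTEGER)"
)

# Hypothetical load record images produced by the transformation step.
load_records = [
    (1, "2009-kharif", 1200),
    (2, "2009-kharif", 3400),
    (1, "2009-rabi", 900),
]

# Batched load: one statement run over all records inside a single
# transaction, instead of a separate insert and commit per record.
with conn:
    conn.executemany("INSERT INTO fact_yield VALUES (?, ?, ?)", load_records)

# Index the newly arrived data so later queries on crop_key are fast.
conn.execute("CREATE INDEX idx_fact_yield_crop ON fact_yield (crop_key)")

row_count = conn.execute("SELECT COUNT(*) FROM fact_yield").fetchone()[0]
```

Creating the index after the load, rather than before, avoids updating the index once per inserted row.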

 

Data Mart

A data mart is a logical subset of an enterprise-wide data warehouse. For example, a data warehouse for a retail chain is constructed incrementally from individual, conformed data marts dealing with separate subject areas such as product sales. Dimensional data marts are organized by subject area such as sales, finance, and marketing, and coordinated by data category such as customer, product, and location. These flexible information stores allow data structures to respond to business changes such as product line additions, new staff responsibilities, mergers, consolidations, and acquisitions.
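One way to picture conformed data marts is two subject-area fact tables that share a single product dimension. The schema, table names and figures below are invented for illustration and use `sqlite3` only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension shared by both marts (hypothetical schema).
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    -- One fact table per subject area.
    CREATE TABLE fact_sales (product_key INTEGER, amount REAL);
    CREATE TABLE fact_marketing (product_key INTEGER, spend REAL);

    INSERT INTO dim_product VALUES (1, 'seed'), (2, 'fertilizer');
    INSERT INTO fact_sales VALUES (1, 500.0), (2, 300.0), (1, 200.0);
    INSERT INTO fact_marketing VALUES (1, 50.0), (2, 80.0);
""")

# Because both marts use the same product keys, their results line up
# row by row when reported against the shared dimension.
query = """
    SELECT p.name,
           (SELECT SUM(amount) FROM fact_sales s
             WHERE s.product_key = p.product_key),
           (SELECT SUM(spend) FROM fact_marketing m
             WHERE m.product_key = p.product_key)
    FROM dim_product p
    ORDER BY p.product_key
"""
combined = conn.execute(query).fetchall()
```

If each mart kept its own product coding, this cross-subject report would first need a mapping between the two, which is what conforming the dimension avoids.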

  

Data Warehouse Database

A data warehouse database contains data that is organized and stored specifically for direct user queries and reports. It differs from an OLTP database in that it is designed primarily for reads, not writes. An OLAP application is a system designed for few but complex (read-only) requests. An OLTP application is a system designed for many but simple concurrent (and updating) requests.

 

Metadata

Metadata defines the content and location of the data in the data warehouse, the relationships between the operational databases and the data warehouse, and the business views of the warehouse data that are accessible to the end-user tools. Users search the metadata to find the subject areas and the definitions of the data.

For decision support, the metadata provides the pointers required to reach the data warehouse. It therefore acts as a logical link between the decision support system application and the data warehouse. Any data warehouse design should thus ensure that there is a mechanism that populates and maintains the metadata repository, and that all access paths to the data warehouse have metadata as an entry point. In other words, no direct access to the data warehouse data should be permitted unless it uses the metadata definitions to gain that access. Metadata definition can be done by the user in any given data warehousing environment. The software environment, as determined by the software tools used, will provide a facility for metadata definition in a metadata repository.
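The idea of metadata as the only entry point can be sketched with a small in-memory repository. Everything here is hypothetical: the `METADATA` and `WAREHOUSE` structures, the entry fields and the two helper functions exist only to illustrate the access rule described above.

```python
# Hypothetical metadata repository: subject area -> definition and location.
METADATA = {
    "crop_yield": {
        "definition": "Harvested weight per farm and season, in kilograms",
        "location": "fact_yield",
        "source": "district agriculture office records",
    },
}

# Stand-in for warehouse storage, keyed by physical table name.
WAREHOUSE = {"fact_yield": [("F001", "2009-kharif", 1200)]}


def find_subject_areas(keyword):
    """Search the metadata for subject areas whose definition mentions keyword."""
    return [
        name
        for name, entry in METADATA.items()
        if keyword.lower() in entry["definition"].lower()
    ]


def read(subject_area):
    """Resolve a subject area through metadata; unknown names are refused."""
    if subject_area not in METADATA:
        raise KeyError(f"no metadata definition for {subject_area!r}")
    return WAREHOUSE[METADATA[subject_area]["location"]]
```

A call such as `read("crop_yield")` succeeds because the name is defined in the repository, while `read("fact_yield")` is refused even though a table of that name exists, since the physical name is not a metadata entry.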

 

OLAP Vs OLTP

 

• OLTP (Online Transactional Processing)

OLTP servers handle mission-critical production data accessed through simple queries. They usually handle queries of an automated nature. OLTP applications consist of a large number of relatively simple transactions. They most often contain data organised on the basis of logical relations between normalised tables.

• OLAP (Online Analytical Processing)

OLAP servers handle management-critical data accessed through an iterative analytical investigation. They usually handle queries of an ad-hoc nature and support more complex and demanding transactions. They contain logically organised data in multiple dimensions.
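The contrast between the two kinds of request can be illustrated on one small table. The sketch below assumes a hypothetical `sales` table in an in-memory SQLite database; a real OLAP server would use a dedicated multidimensional or columnar store rather than the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY, region TEXT, crop TEXT, qty INTEGER
    );
    INSERT INTO sales (region, crop, qty) VALUES
        ('north', 'cotton', 10), ('north', 'banana', 5),
        ('south', 'cotton', 7),  ('south', 'banana', 8);
""")

# OLTP-style request: a simple update touching one row by its key.
with conn:
    conn.execute("UPDATE sales SET qty = qty + 1 WHERE id = ?", (1,))

# OLAP-style requests: read-only aggregates along one or two dimensions.
by_region_crop = conn.execute(
    "SELECT region, crop, SUM(qty) FROM sales "
    "GROUP BY region, crop ORDER BY region, crop"
).fetchall()
by_region = conn.execute(
    "SELECT region, SUM(qty) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

The update is short and changes data; the two aggregate queries change nothing but scan the whole table, first at the (region, crop) level and then rolled up to region alone.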

 

2.    Warehouse Schema Design

Dimensional modeling is a term used to refer to a set of data modeling techniques that have gained popularity and acceptance for data warehouse implementation. Dimensional modeling is one of the key
