Data Warehouse And Its Applications In Agriculture
K. P. Wagh, Assistant Professor, Gf’s GCOE Jalgaon, Kishorwagh2000@yahoo.com
Dr. Satish R. Kolhe, Reader, NMU Jalgaon, srkolhe2000@gmail.com
A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated, which makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.
Economic foundations and productivity growth depend on the agricultural sector. Agriculture is the driving force behind the way of life and the source of earnings for the majority of people. More than 60 percent of the population lives in rural areas, and the majority are farmers. The rural communities, as the main producers for the country's food productivity and food security, earn only 11 percent of Gross Domestic Product (GDP). The arrival of the information age guides this country to new development strategies.
The National Electronics and Computer Technology Center (NECTEC), in collaboration with the Ministry of Agriculture, has launched the “Agriculture Information Network” in response to the unmet information requirements of the agricultural sector. Farmers should benefit from the contents provided, which include risk assessment, an agriculture warning system and an agricultural knowledge base, and which aim to improve the technology, productivity, income and stability of India's agriculture sector in the age of Information Technology. The data warehouse consists of common databases and geo-spatial databases from various departments and organizations in the country and abroad. Farmers can access the contents through the Internet by themselves or through groups of professionals called “Information Brokers”.
Keywords: Data Warehouse, Agriculture, IT
1. Introduction
A data warehouse [1] is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated, which makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.
Goals of Data Warehousing
• Facilitate reporting as well as analysis
• Maintain an organization's historical information
• Be an adaptive and resilient source of information
• Be the foundation for decision making
Data Warehouse Architecture
A data warehouse architecture comprises:
• Operational source systems
• A data staging area
• One or more conformed data marts
• A data warehouse database
Operational Source Systems
Operational source systems [1] are developed to capture and process original business transactions. These systems are designed for data entry, not for reporting, but they are the source from which the data warehouse is populated.
Data Staging Area
The data staging area is where the raw operational data is extracted, cleaned, transformed and combined so that it can be reported on and queried by users. This area lies between the operational source systems and the user database and is typically not accessible to users.
Data staging is a major process that includes the following sub procedures:
Extraction
The extract step is the first step of getting data into the data warehouse environment. Extracting means reading and understanding the source data, and copying the parts that are needed to the data staging area for further work.
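As a minimal sketch of the extract step, the following Python code reads rows from an operational export and copies only the needed fields into a staging list. The CSV layout, field names and sample rows are hypothetical and chosen purely for illustration.

```python
import csv
import io

# Hypothetical export from an operational source system (illustrative data).
SOURCE_CSV = """farmer_id,name,crop,area_ha,yield_kg,entry_clerk
101,A. Patil,Cotton,2.5,1800,clerk7
102,B. Jadhav,Banana,1.0,25000,clerk3
"""

# Fields the warehouse needs; everything else stays in the source system.
NEEDED_FIELDS = ["farmer_id", "crop", "area_ha", "yield_kg"]


def extract(source_text, needed_fields):
    """Read source rows and copy only the needed fields into staging."""
    reader = csv.DictReader(io.StringIO(source_text))
    return [{field: row[field] for field in needed_fields} for row in reader]


staging_rows = extract(SOURCE_CSV, NEEDED_FIELDS)
```

The values are still raw strings at this point; typing and cleaning are left to the transformation step.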
Transformation
Once the data is extracted into the data staging area, there are many transformation steps, including:
1. Cleaning the data by correcting misspellings, resolving domain conflicts, dealing with missing data elements, and parsing into standard formats.
2. Purging selected fields from the legacy data that are not useful for the data warehouse.
3. Combining data sources by matching exactly on key values or by performing fuzzy matches on non-key attributes.
4. Creating surrogate keys for each dimension record in order to avoid dependency on legacy defined keys, where the surrogate key generation process enforces referential integrity between the dimension tables and fact tables.
5. Building the aggregates for boosting the performance of common queries.
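Several of the steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a production ETL routine: the staged records, the list of known crop names and the 0.8 similarity cutoff are all hypothetical. It covers cleaning (step 1, using the standard library's `difflib` to correct misspellings), purging a legacy field (step 2), surrogate key assignment (step 4) and a simple aggregate (step 5). Combining several sources (step 3) is not shown.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical staged records; names and values are illustrative only.
staged = [
    {"legacy_id": "MH-17", "crop": " cotton ", "region": "Jalgaon", "yield_kg": "1800"},
    {"legacy_id": "MH-18", "crop": "Coton", "region": "Jalgaon", "yield_kg": ""},
    {"legacy_id": "GJ-02", "crop": "Banana", "region": "Surat", "yield_kg": "25000"},
]
KNOWN_CROPS = ["Cotton", "Banana", "Wheat"]


def clean_crop(raw):
    """Step 1: normalise spacing and case, then correct near-miss spellings."""
    candidate = raw.strip().title()
    matches = get_close_matches(candidate, KNOWN_CROPS, n=1, cutoff=0.8)
    return matches[0] if matches else candidate


def to_number(raw):
    """Step 1: treat empty strings as missing values."""
    return float(raw) if raw.strip() else None


surrogate_keys = {}  # cleaned crop name -> warehouse-owned integer key


def surrogate_key_for(crop):
    """Step 4: assign a key that does not depend on legacy-defined keys."""
    if crop not in surrogate_keys:
        surrogate_keys[crop] = len(surrogate_keys) + 1
    return surrogate_keys[crop]


fact_rows = []
for row in staged:
    crop = clean_crop(row["crop"])
    # Step 2: legacy_id is purged; it is not carried into the fact row.
    fact_rows.append({
        "crop_key": surrogate_key_for(crop),
        "region": row["region"],
        "yield_kg": to_number(row["yield_kg"]),
    })

# Step 5: pre-compute an aggregate (total yield per crop key), skipping missing values.
total_yield_by_crop = defaultdict(float)
for fact in fact_rows:
    if fact["yield_kg"] is not None:
        total_yield_by_crop[fact["crop_key"]] += fact["yield_kg"]
```

Because every fact row obtains its `crop_key` from the same key generator that defines the dimension, a fact row cannot reference a crop that the dimension does not contain.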
Loading and indexing
At the end of the transformation process, the data is in the form of load record images. Loading in the data warehouse environment usually takes the form of replicating the dimension tables and fact tables and presenting these tables to the bulk loading facilities of each recipient data mart. Bulk loading is a very important capability that is to be contrasted with record-at-a-time loading, which is far slower. The target data mart must then index the newly arrived data for query performance.
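The load-then-index sequence can be illustrated with Python's built-in `sqlite3` module. This is only a sketch: an in-memory SQLite database stands in for a data mart, a batched `executemany` inside a single transaction stands in for a real bulk loading facility, and the table, index and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical load record images produced by the transformation step.
load_records = [(1, "Jalgaon", 1800.0), (1, "Dhule", 1500.0), (2, "Jalgaon", 25000.0)]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a data mart
conn.execute("CREATE TABLE yield_fact (crop_key INTEGER, region TEXT, yield_kg REAL)")

# Batch load: all rows go in with one call inside one transaction,
# rather than one INSERT and one commit per record.
with conn:
    conn.executemany("INSERT INTO yield_fact VALUES (?, ?, ?)", load_records)

# Index the newly arrived data after the load, for query performance.
conn.execute("CREATE INDEX idx_yield_crop ON yield_fact (crop_key)")

row_count = conn.execute("SELECT COUNT(*) FROM yield_fact").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('yield_fact')")]
```

Creating the index after the load, rather than before, means the database does not have to maintain it row by row while the data arrives.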
Data Mart
A data mart is a logical subset of an enterprise-wide data warehouse. For example, a data warehouse for a retail chain is constructed incrementally from individual, conformed data marts dealing with separate subject areas such as product sales. Dimensional data marts are organized by subject area, such as sales, finance, and marketing, and coordinated by data category, such as customer, product, and location. These flexible information stores allow data structures to respond to business changes such as product line additions, new staff responsibilities, mergers, consolidations, and acquisitions.
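To illustrate what coordination by a shared data category buys, the sketch below models two subject-area data marts that both key their facts on the same product dimension. All names and figures are hypothetical; the point is only that a shared (conformed) dimension lets measures from separate marts be combined.

```python
# Conformed dimension shared by both marts (hypothetical product data).
product_dim = {1: "Cotton seed", 2: "Fertiliser"}

# Two subject-area data marts, each with its own fact rows keyed by product_key.
sales_mart = [
    {"product_key": 1, "revenue": 500.0},
    {"product_key": 2, "revenue": 300.0},
]
marketing_mart = [
    {"product_key": 1, "campaign_cost": 50.0},
    {"product_key": 2, "campaign_cost": 90.0},
]


def drill_across(product_key):
    """Combine measures from both marts for one product via the shared key."""
    revenue = sum(r["revenue"] for r in sales_mart if r["product_key"] == product_key)
    cost = sum(r["campaign_cost"] for r in marketing_mart if r["product_key"] == product_key)
    return {"product": product_dim[product_key], "revenue": revenue, "campaign_cost": cost}


report = [drill_across(key) for key in sorted(product_dim)]
```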
Data Warehouse Database
A data warehouse database contains the data that is organized and stored specifically for direct user queries and reports. It differs from an OLTP database in that it is designed primarily for reads, not writes. An OLAP application is a system designed for few but complex (read-only) requests. An OLTP application is a system designed for many but simple concurrent (and updating) requests.
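The difference in request shape can be seen in a small `sqlite3` sketch. The table and rows are hypothetical, and one SQLite database is used for both styles purely for illustration; in practice the two workloads would normally run on separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, crop TEXT, year INTEGER, qty_kg REAL)"
)
conn.executemany(
    "INSERT INTO sales (crop, year, qty_kg) VALUES (?, ?, ?)",
    [("Cotton", 2009, 100.0), ("Cotton", 2010, 150.0), ("Banana", 2010, 300.0)],
)

# OLTP-style request: a simple update touching a single row.
with conn:
    conn.execute("UPDATE sales SET qty_kg = qty_kg + 50 WHERE id = ?", (1,))

# OLAP-style request: a read-only query that scans and summarises many rows.
totals_by_crop = conn.execute(
    "SELECT crop, SUM(qty_kg) FROM sales GROUP BY crop ORDER BY crop"
).fetchall()
```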
Metadata
Metadata defines the content and location of the data in the data warehouse, the relationships between the operational databases and the data warehouse, and the business views of the data in the warehouse as accessible to the end-user tools. Users search the metadata to find the subject areas and the definitions of the data.
For decision support, the pointers required into the data warehouse are provided by the metadata. It therefore acts as a logical link between the decision support system application and the data warehouse. Thus, any data warehouse design should ensure that there is a mechanism that populates and maintains the metadata repository and that all access paths to the data warehouse have metadata as an entry point. In other words, no direct access to the data warehouse data should be permitted unless it uses the metadata definitions to gain access. Metadata can be defined by the user in any given data warehousing environment; the software tools in use provide a facility for defining metadata in a metadata repository.
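A minimal sketch of metadata as the single entry point, assuming a hypothetical in-memory repository (the subject areas, table names and source system names are invented for illustration): users search the metadata for subject areas, and a table can only be resolved through a metadata definition.

```python
# Hypothetical metadata repository: subject area -> location and definition.
METADATA = {
    "crop_yield": {
        "table": "yield_fact",
        "source_system": "district_crop_survey",
        "definition": "Harvested yield in kilograms per crop and region",
    },
    "rainfall": {
        "table": "rainfall_fact",
        "source_system": "weather_stations",
        "definition": "Monthly rainfall in millimetres per region",
    },
}


def find_subject_areas(keyword):
    """Let a user search the metadata for subject areas by keyword."""
    keyword = keyword.lower()
    return sorted(
        name
        for name, entry in METADATA.items()
        if keyword in name.lower() or keyword in entry["definition"].lower()
    )


def resolve_table(subject_area):
    """Single entry point: refuse access unless metadata defines the subject area."""
    if subject_area not in METADATA:
        raise PermissionError(f"No metadata definition for {subject_area!r}")
    return METADATA[subject_area]["table"]
```

For example, `find_subject_areas("rain")` locates the `rainfall` subject area, while `resolve_table("soil")` raises `PermissionError` because no metadata definition exists for it.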
OLAP Vs OLTP
OLTP (Online Transaction Processing)
• OLTP servers handle mission-critical production data accessed through simple queries
• Usually handles queries of an automated nature
• OLTP applications consist of a large number of relatively simple transactions
• Most often contains data organised on the basis of logical relations between normalised tables
OLAP (Online Analytical Processing)
• OLAP servers handle management-critical data accessed through an iterative analytical investigation
• Usually handles queries of an ad-hoc nature
• Supports more complex and demanding transactions
• Contains logically organised data in multiple dimensions
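Data organised in multiple dimensions can be pictured as a cube whose cells are addressed by one member from each dimension. The sketch below is illustrative only: the dimensions (crop, region, year) and the figures are hypothetical, and a plain dictionary stands in for an OLAP server. It shows two typical analytical operations, a roll-up and a slice.

```python
from collections import defaultdict

DIMENSIONS = ("crop", "region", "year")

# Hypothetical cells of a small data cube: (crop, region, year) -> yield in tonnes.
cube = {
    ("Cotton", "Jalgaon", 2009): 12.0,
    ("Cotton", "Jalgaon", 2010): 15.0,
    ("Cotton", "Dhule", 2010): 8.0,
    ("Banana", "Jalgaon", 2010): 40.0,
}


def roll_up(cube, keep):
    """Aggregate the cube over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for cell, value in cube.items():
        totals[tuple(cell[p] for p in positions)] += value
    return dict(totals)


def slice_cube(cube, dimension, member):
    """Fix one dimension to a single member, keeping the other dimensions."""
    position = DIMENSIONS.index(dimension)
    return {cell: value for cell, value in cube.items() if cell[position] == member}


by_crop = roll_up(cube, ["crop"])
by_region_2010 = roll_up(slice_cube(cube, "year", 2010), ["region"])
```

An iterative investigation of the kind described above would chain such operations, for instance rolling up by crop first and then slicing into a single year to see which regions drive the total.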
2. Warehouse Schema Design
Dimensional modeling is a term used to refer to a set of data modeling techniques that have gained popularity and acceptance for data warehouse implementation. Dimensional modeling is one of the key