
December 16, 2006

Data Warehouse Adds Power to Data Mining Tools

-- By Pushpa Sathish, Staff Writer

This is business intelligence at its best – software that helps you read your customers’ minds. Teradata Warehouse Miner 5.0 claims to be able to do just that. The database tool from the data warehouse provider leverages the most advanced analytic skills when used in conjunction with data-mining solutions from SAS, SPSS, Fair Isaac Model Builder and KXEN, to provide a peek into the future buying behavior of customers. According to Randy Lea, VP of products and services at Teradata, the combined solution facilitates faster model development, with runtimes becoming 25 times faster.

With the latest version of the software, improvement is promised in operations like data profiling, Analytic Data Set (ADS) generation, Predictive Model Markup Language (PMML), and model management. Follow this link for more information.

September 30, 2006

Teradata Upgrades Data Warehouse

The fourth quarter this year will see the release of Teradata’s upgraded data warehouse. Version 8.2 will feature enhanced support for real-time intelligence in the form of quick and predictable high-frequency queries from front-end customer service applications. The Linux version of the database that runs on the 64-bit SUSE Linux Enterprise from Novell is now available. Database Trends and Applications reports:

The “active data warehouse” from Teradata can now create partitions in join indexes, acting as "alphabetical tabs" that allow the database to look only in relevant sections and avoid scanning tens of millions of rows in a join index.
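
The "alphabetical tabs" analogy can be illustrated with a minimal Python sketch. This is not Teradata's implementation: the function names `build_partitions` and `lookup`, the sample rows, and the choice to partition by first letter are all assumptions made only to mirror the analogy.

```python
from collections import defaultdict

def build_partitions(rows, key):
    """Group rows into partitions keyed by the first letter of row[key]."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key][0].upper()].append(row)
    return partitions

def lookup(partitions, key, value):
    """Scan only the partition that can contain `value`.

    Returns the matching rows and the number of rows actually scanned.
    """
    candidates = partitions.get(value[0].upper(), [])
    matches = [row for row in candidates if row[key] == value]
    return matches, len(candidates)

# Hypothetical sample data standing in for a large join index.
rows = [
    {"customer": "Adams", "order_id": 1},
    {"customer": "Baker", "order_id": 2},
    {"customer": "Brown", "order_id": 3},
    {"customer": "Clark", "order_id": 4},
]

partitions = build_partitions(rows, "customer")
matches, scanned = lookup(partitions, "customer", "Brown")
print(matches, scanned)
```

Here the lookup for "Brown" touches only the two rows under the "B" tab instead of all four, which is the effect the partitions are meant to achieve at a much larger scale.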

August 04, 2006

Oracle Ships Data Warehousing Solutions

Data warehousing solution providers can now take advantage of Oracle’s two Oracle Warehouse Builder 10g database tools that were released earlier this week, to manage data and metadata lifecycles. Release 2 is a database design and Extraction, Transformation and Load (ETL) tool. Customers can also buy extra ETL and Data Quality options at $10,000 per CPU or $200 per user and $15,000 per CPU or $300 per user respectively. Connectors for PeopleSoft Enterprise, Oracle E-Business Suite and SAP come at $20,000 per connector per target application. The ETL options can be used in heterogeneous environments while the data quality options can be used for data profiling, cleansing, auto-correction, and auditing. Biz Intelligence Pipeline reports:

Oracle's new tool also reflects a trend in which the worlds of data warehouses, traditionally used for historical data, and operational databases, typically used for everyday tasks, are becoming more closely linked.

March 24, 2006

Preserving For Posterity

With the rapid changes taking place in today's world, preserving important documents and photographs for posterity is the only way future generations will be able to gain a glimpse into the past. That is why document digitisation company iArchives has chosen network-attached storage (NAS) solutions provider Exanet to preserve its historical documents and pictures. iArchives is now using Exanet's ExaStore NAS system to archive all files related to newspapers, libraries, universities, and law firms. Data Warehouse IT Toolbox reports:

The 2-node ExaStore system supports 75 iArchives processing nodes, enabling it to handle the necessary simultaneous connections and allowing iArchives to point all of its processing nodes at the system.

New Database Modeling Tool

There's a new database-modeling tool available in the European market from the Czech Republic-based company Charonware. The latest release of CASE Studio 2, version 2.23, has been built to support the newest trends in database modeling and development. The tool adds support for PostgreSQL 8.1 and Advantage Database Server 8, besides being highly flexible and facilitating error fixing. It can also generate SQL scripts and detailed reports, automatically verify work, support more than 30 database systems, and be customized using a template editor. The tool makes the process of database modeling simple, effective, and quick. A trial version is available as a free download on the company's website, for those who wish to sample a bite before buying the whole enchilada. If you wish to gather more information about CASE Studio 2, you can do so by following this link.

March 17, 2006

IBM Simplifies Data Warehousing

The release of IBM's DB2 Data Warehouse Edition version 9.1 marks the second phase of the company's strategy to reduce the complications associated with data warehouses. While IBM introduced the balanced configuration unit (BCU) in the first phase, the second phase integrates various existing and new tools into one bundle. The product comes in two editions: Base and Enterprise. The Base Edition will include Cube Views, Data Modeling, OLAP Modeling, and the Integrated Installer, while the Enterprise Edition will have all the features in the Base Edition along with the Data Partitioning feature, Query Patroller, Intelligent Miner, the web-based Admin Console, Alphablox, the Design Studio (with the Data Flow and Data Mining Editors), and the SQL Warehousing Tool.

March 10, 2006

Integrated Data Warehouse From IBM

The latest version of IBM's DB2 Data Warehouse Edition comes with completely integrated data mining and OLAP tools. Version 9.1 incorporates DB2 Alphablox tools that help in building custom applications with embedded analytics, WebSphere application server, Rational development tools, and other software into the DB2 Universal Database. Biz Intelligence Pipeline reports:

In the past, DB2 Data Warehouse Edition was sold as separate components that customers assembled themselves, says Karen Parrish, VP of business intelligence at IBM.

Integrating Business and Service Data

All organizations exist for the dual purpose of either providing a service or selling a product and, most important, making profits. In a move designed to achieve that, Evanston Northwestern Healthcare (ENH), a US integrated healthcare system affiliated with Northwestern University, has deployed a data warehouse solution from Informatica Corp. PowerCenter Advanced Edition will enhance the efficiency of the company's operations, step up corporate performance, and improve the way in which patients are cared for. Data Warehouse News reports:

ENH reportedly plans to implement PowerCenter as the foundation of a new initiative: an enterprise-wide financial and clinical data warehouse that will be built by integrating medical and business data across the numerous departments of the organization.

February 21, 2006

Teradata Data Warehouse Heads East

ABN AMRO, the international banking services provider, will implement a data warehouse platform from Teradata, provider of enterprise analytic technologies and services, to support business development for its consumer businesses in Asia. The regional data warehouse (RDW) will first be rolled out at the bank's Taiwan branches, followed by those in Hong Kong, Southeast Asia, and China, to analyze customer revenue, monitor credit risk metrics, and handle customer relationship management (CRM). Data Warehouse Knowledge Base reports:

"ABN AMRO is deeply rooted in Asian financial markets," said Jim Brown, head of the Asia Consumer Client Segment of ABN AMRO. "To better serve our customers and fulfill the needs of the company's marketing management and risk control, we need a robust decision support platform. After a thorough vendor evaluation process, we decided to work with Teradata to deploy our data warehouse and CRM solution."

February 19, 2006

Enterprise Information Integration

Enterprise Information Integration (EII) is being touted as the solution to the ills plaguing the business world: regulatory compliance, real-time business intelligence (BI), and the daunting task of converging structured and unstructured information. EII is defined as the integration of data from multiple systems into a unified, consistent and accurate representation geared toward the viewing and manipulation of the data. It integrates the information assets of an enterprise by providing access to diverse information sources held in disparate silos.

This definition of EII probably reminds you of traditional information integration techniques like ETL-oriented data warehousing and customer data integration. The difference is that EII accesses the information instead of moving it. EII provides a consolidated view of data through virtualization techniques that hide the combined query processing system that pulls data from various sources, while ETL moves data to data repositories and data marts. EII focuses on less data movement and transformation while combining disparate definitions of data elements using strong global query optimization. EII is not a replacement for data warehousing; it complements the technology by bringing in data from minor or non-standard sources, and presenting it to the client on demand.

Other pivotal elements of EII include metadata management and robust data modeling. Metadata supports data reusability by creating and maintaining the logic and interfaces needed to preserve virtual views of customers and products. EII tools help maintain and enhance security of metadata and data in diverse sources. The technology is vital in providing an integrated platform that blends data access standards with data about the sources and the information needs.

In a nutshell, EII aims at providing a unified, on-demand view of data by creating access to multiple and different sources of data securely and efficiently.
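
The contrast between moving data (ETL) and accessing it in place (EII) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "source systems" are plain dictionaries, and the names `crm_system`, `billing_system`, `etl_load`, and `eii_view` are hypothetical, not part of any EII product.

```python
# Two hypothetical source systems, modeled as in-memory dictionaries.
crm_system = {101: {"name": "Acme Corp"}}
billing_system = {101: {"balance": 250.0}}

def etl_load(crm, billing):
    """ETL style: copy and merge the data into a separate repository once."""
    return {cid: {**crm[cid], **billing.get(cid, {})} for cid in crm}

def eii_view(crm, billing, customer_id):
    """EII style: query the sources at request time; nothing is copied ahead of time."""
    return {**crm[customer_id], **billing.get(customer_id, {})}

warehouse = etl_load(crm_system, billing_system)   # snapshot taken now

billing_system[101]["balance"] = 75.0              # a source changes later

print(warehouse[101]["balance"])                             # the copied snapshot
print(eii_view(crm_system, billing_system, 101)["balance"])  # the live source
```

The warehouse copy keeps the value from load time until the next load, while the virtual view always reflects the current state of the sources. That is why the two approaches complement rather than replace each other.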

HP Tests DB Archival Waters

Hewlett-Packard (HP) is moving into the database archiving market with the impending purchase of database archiving specialist OuterBay Technologies. The computer manufacturer is basing the acquisition on the belief that database volumes will grow rapidly. Computer Wire reports:

OuterBay's software prunes databases of infrequently accessed data, moving older data to lower storage tiers in order to maximize application performance and reduce tier one capacity usage. Financial details of the purchase by HP were not revealed.

February 15, 2006

Healthcare Provider Deploys Data Warehouse Solution

Providence Health & Services (PHS) has deployed an enterprise data warehouse solution based on the Ensemble universal integration platform from InterSystems Corporation. PHS, which operates acute care facilities, freestanding long-term care facilities, and low income and assisted living facilities across five states in the Pacific Northwest region, is using Ensemble to integrate information from 12 systems including 30 data repositories. Database Trends and Applications reports:

The integration project highlights the growing trend of leveraging a wide range of data sources to enable optimal healthcare decisions. More than 400,000 encounter records will be streaming into the PHS warehouse annually. In addition, more than 1.5 million encounter records will flow in from more than 150 physicians' locations and clinics.

Information Lifecycle Management

With the need to store virtually every item of data nearly forever, and the rising costs of data storage and maintenance, information lifecycle management (ILM) is becoming an increasing concern for organizations today. ILM is the process by which data is classified and sorted according to its worth to the business. The data so classified is then stored, managed and protected according to its importance.

Though ILM brings with it the benefits of reduced storage costs, maximized utilization, minimized redundancy, and security in regulatory compliance situations, the process is often ambiguous and difficult to implement. This is because defining the value of data is a complex process, as data keeps changing with time. The criticality of data, its frequency of access, and the length of time it needs to be stored are factors that contribute to the calculation of its value.

ILM is a continuous process that can only be improved, never completed, according to Cliff Dutton, executive vice president and chief technology officer at Ibis Consulting, Inc., a provider of electronic discovery and compliance solutions. Cataloguing and classifying huge volumes of data is too daunting a task to attempt, so organizations are better off identifying the subset of data that is critical to business processes and has certain meaning.

Though technologies like HSM software, content management systems, virtualization tools, e-mail/database archiving products, backup software, SATA arrays and content-addressable storage arrays help in the discovery, classification, storage, archival, and automatic movement of data, the ILM decision should focus on the business policies of the organization rather than the ILM infrastructure.

The most time-consuming and laborious part of the ILM strategy is the definition of the business requirements for data storage and protection, says Dutton. By delineating the goal of the organization in implementing ILM, be it disaster recovery, business continuity, regulatory compliance, or meeting service level agreements with external or internal clients, a decision can be reached on how much money should be spent.

Today, the term ILM is being used more to describe the implementation of good management practices to reduce costs, says Jim Damoulakis, CTO at Glasshouse Technologies.
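
The three value factors named earlier (criticality, frequency of access, and retention period) can be turned into a simple classification rule. The Python sketch below is only an illustration: the `DataSet` fields, the 1-to-5 criticality scale, the thresholds, and the tier names are all made-up assumptions, and a real policy would come from the organization's business requirements.

```python
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    criticality: int         # 1 (low) to 5 (high); hypothetical scale
    accesses_per_month: int  # how often the data is read
    retention_years: int     # how long it must still be kept

def assign_tier(ds: DataSet) -> str:
    """Pick a storage tier from illustrative, made-up thresholds."""
    if ds.criticality >= 4 or ds.accesses_per_month >= 100:
        return "primary"
    if ds.accesses_per_month >= 5:
        return "secondary"
    if ds.retention_years > 0:
        return "archive"
    return "review for deletion"

# Hypothetical catalog entries.
catalog = [
    DataSet("open_orders", 5, 500, 7),
    DataSet("monthly_reports", 2, 10, 1),
    DataSet("2001_audit_logs", 1, 0, 3),
    DataSet("temp_exports", 1, 0, 0),
]
tiers = {ds.name: assign_tier(ds) for ds in catalog}
print(tiers)
```

Because the value of data changes with time, a rule like this would have to be re-run periodically so that data sets migrate between tiers as their access patterns and retention needs change.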

February 12, 2006

Repositories for Metadata

To understand data and work efficiently with it, one has to grasp the idea of metadata. To put it simply, metadata is data about data, its context and semantics. Though the concept of a repository to store metadata sounds useful, it is theoretically hard to achieve. The metadata storage houses available today are complex, expensive and do not integrate well with modern tools as they tend to follow very formal processes and methods.

However, a few vendors like Xcalia, BEA Systems, Informatica and IBM are working on easy to implement, cost-effective metadata repositories. Xcalia is implementing an XML table-based metadatabase in its Intermediation Platform, which allows the creation of metadata-based transformation rules that allow services and data sources to interact consistently, in parallel with the data's context and semantics, while Informatica is using a metadata repository in its PowerCenter Data Federation data-integration platform.

February 08, 2006

Storage Certification Program

IT managers and database administrators looking to put together high-performance storage networks that can speed up I/O-intensive databases, online transaction processing (OLTP), online analytical processing, modeling, content streaming, and high-volume data acquisition environments will benefit from Texas Memory Systems' storage certification program, Fast Access Storage Tested (FAST). Customers who would like to avoid buying various small products from multiple vendors to build a comprehensive solution will now be able to purchase third-party products bundled together with solid state products from Texas Memory Systems. Database Trends and Applications reports:

Texas Memory has tested 4-gigabit host bus adapters and switches and certified them "FAST" under the program. All FAST certified products passed Texas Memory Systems interoperability tests and demonstrated performance or features critical to Texas Memory Systems' solid state disk customers, according to the company.

Free Data Server Download from IBM

IBM's DB2 Universal Database Express-C (DB2 Express-C) is available as a free download for those looking for a flexible and easy to deploy data server. DB2 Express-C allows any number of users and does not limit the size of the database. Database Trends and Applications reports:

DB2 Express-C offers the same core DB2 data server in a smaller package specifically designed for use in software development, deployment, redistribution and embedding within applications. No charge community support for DB2 Express-C is available via a new public forum on developerWorks, IBM's resource for developers, with optional for-fee support offered by IBM.

February 07, 2006

Storage Needs of an Organization

The storage strategy of an organization is determined based on certain criteria. Business criteria include assessing the growth stage your company is in, and the backup and restore storage requirements. Technology criteria encompass the types and growth rates of data, and the operating systems and network technology. Lifecycle criteria include the costs associated with scaling, total cost of ownership, administration, maintenance, and hardware and software.

OLAP Survey 5

The results of OLAP Survey 5 conducted by Nigel Pendse and Survey.com provide a detailed view into BI implementations and customer experiences with BI products. The annually conducted survey covers a select group of perceived equal BI application providers known as the "Peer Group". The fifth survey "Peer Group" included the products MicroStrategy, Hyperion Essbase, Cognos PowerPlay, Business Objects, and Oracle Discoverer.

The survey revealed that all the products differed significantly in the number of users or data volumes supported. Most users of OLAP were interested in expanding the use of their solutions. Organizations that conducted a formal evaluation of various products before deciding on the right solution for their processes achieved more success from their OLAP product. Customers were divided over the loyalty issue; while some were happy to stay with their existing provider, others were considering jumping ship to the competition.

The survey also concluded that clear trends have emerged in key areas like customer loyalty, product support quality, data volume, web deployment rate, prevalence rate, and number of seats purchased and deployed.

Data Warehouse Modes

A data warehouse can be designed and structured in three different ways:

1. ROLAP or Relational online analytical processing: Data in this mode is stored in relational databases.

2. MOLAP or Multidimensional online analytical processing: This is the traditionally used mode in OLAP analysis in which data is stored in the form of multidimensional cubes.

3. HOLAP or Hybrid online analytical processing: This mode combines the best features of both the ROLAP and MOLAP modes. MOLAP is used to provide summaries, while ROLAP is used to dig deep into the database for details.
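
The three modes can be sketched in Python using the standard `sqlite3` module. This is a toy illustration, not how any OLAP product is built: the `sales` table, its columns, and the `summary` and `details` helpers are hypothetical.

```python
import sqlite3

# ROLAP: facts live in a relational table and are queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", 2005, 100.0), ("East", 2005, 50.0), ("West", 2005, 70.0)],
)

# MOLAP: aggregates are precomputed into a cube keyed by dimension values.
cube = {}
for region, year, total in conn.execute(
    "SELECT region, year, SUM(amount) FROM sales GROUP BY region, year"
):
    cube[(region, year)] = total

def summary(region, year):
    """HOLAP summary path: answer from the precomputed cube."""
    return cube[(region, year)]

def details(region, year):
    """HOLAP detail path: drill into the relational rows."""
    cur = conn.execute(
        "SELECT amount FROM sales WHERE region = ? AND year = ? ORDER BY amount",
        (region, year),
    )
    return [row[0] for row in cur]

print(summary("East", 2005))   # total taken from the cube
print(details("East", 2005))   # individual rows taken from the table
```

The summary lookup never touches the fact table, while the detail query never needs the cube. Combining the two paths is the essence of the hybrid mode.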

February 05, 2006

The Value of Datamarts

You have probably come across the term data warehouse, but have you heard the expression "datamart" used in conjunction with the former? Both terms define the storage of data, but on different levels. While a data warehouse is concerned with storage details that focus on the organization of data, a datamart involves the way data is displayed and presented.

A datamart is defined as a logically related subset of data extracted from the complete data warehouse, meaning that the subset of data is related to a single business process or a group of related business processes. Usually, data meeting one or more criteria is extracted to form a datamart, and many datamarts can be used to extract data from one central data warehouse. Here, the focus is on providing customers ease of use, up-to-date and quick reporting capabilities, and effortless mining of sensitive data.

Datamarts are advantageous because they can be designed and built separately from the data warehouse, by just following the underlying architecture of the data warehouse. Marts that are built asynchronously can be used in conjunction with each other. This provides customers a simpler way of working only with data that is related to their processes, rather than being concerned with the complexity of the entire data warehouse.

Development teams also find datamarts useful in designing and maintaining customer applications, as the entire data warehouse design is broken down into simple, uncomplicated structures.

Data inside a datamart can be aggregated, summarized, and averaged according to the specific needs of businesses. Reporting is enhanced while using datamarts since smaller queries performing on a small subset of data are easier to process.
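
As a minimal sketch of these ideas, the Python example below uses the standard `sqlite3` module to extract a datamart for one business process from a central table and then runs an aggregated report against it. The table names `warehouse_facts` and `sales_mart`, the columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny stand-in for the central data warehouse.
conn.execute(
    "CREATE TABLE warehouse_facts (process TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "East", 120.0),
        ("sales", "East", 80.0),
        ("sales", "West", 40.0),
        ("payroll", "East", 900.0),
    ],
)

# Extract only the rows that meet the criterion for one business process.
conn.execute(
    "CREATE TABLE sales_mart AS "
    "SELECT region, amount FROM warehouse_facts WHERE process = 'sales'"
)

# Reports run against the smaller mart, not the full warehouse.
report = conn.execute(
    "SELECT region, SUM(amount), AVG(amount) FROM sales_mart "
    "GROUP BY region ORDER BY region"
).fetchall()
print(report)
```

The report query sees only the sales rows, so it stays small and simple regardless of how many other business processes the warehouse holds.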

Energy Saving Trends

Of late, the spotlight has shifted to energy saving and pollution control measures to reduce environmental degradation and the greenhouse effect. A two-day conference organized by Sun Microsystems and other large technology firms in Santa Clara, to thrash out the finer points of energy efficiency and conservation in data centers, highlighted this issue.

An average-sized data center, measuring 50,000 square feet, consumes the power needed to light up 2,500 houses. The power is used to run and cool the numerous computer servers housed in a typical data center.

With companies expanding by the day, more employees are added, which in turn translates to more computer systems and more cooling units. The cooling units guzzle the same amount of energy required to keep the systems running. With uninterrupted power supplies for data centers running all day long, it is no wonder that some companies spend up to $2 million towards their data center costs.

"If we can reduce the amount of power data centers consume, we could probably reduce the number of blackouts and slow down the need for new power plants," said Noah Horowitz, a senior scientist at the Natural Resources Defense Council in San Francisco. He concurs with Sun's proposition that the US Environmental Protection Agency (EPA) should develop an Energy Star program that can be used to measure the performance of network servers against a specific metric, such as performance per watt. The challenge lies in providing metrics for the whole data center, he adds.

The Energy Star program is a voluntary label for all consumer appliances to certify that they use less energy and reduce greenhouse gas emissions. Though the program was first applied to certain computers and monitors when announced in 1992, it was later extended to include consumer appliances as well.

Sun, on its part, is implementing new low power consumption chips and servers on its systems.

February 01, 2006

Compliance Rules Affect Data Centers

A survey conducted by Unisphere Media has revealed that most companies had effected changes in their data center processes to comply with new regulatory requirements. Nearly 70% of the respondents of the study, ‘Compliance Management: All Roads Lead to the Data Center’, said they had implemented additional security measures like encryption and additional layers of approvals and documentation. Database Trends and Applications reports:

The study was sponsored by SHARE, the premier IBM user group, in cooperation with its 2005 Alliance Vendors - American Power Conversion, Computer Associates, EMC Corporation, Innovation Data Processing, Isogon: An IBM Company, ISPW BenchMark Technologies Ltd., Luminex Software, Inc., Mainstar Software Corporation, and Siemon.

January 31, 2006

IBM Releases UIMA Source Code

The open source community is set to benefit from IBM’s Unstructured Information Management Architecture (UIMA) technology. The firm released the source code of UIMA to encourage innovation and allow analytics software tools from multiple sources to work together and build on each other, according to Nelson Mattos, Vice-President of Information and Interaction at IBM Research. UIMA employs a Java-based technology that allows users to analyze information in unstructured file formats like documents, images, comments, email messages and multimedia files. Searches can be performed on concepts and related topics rather than on keywords alone. Data Warehouse reports:

UIMA was first unveiled in 2004 and was developed by the Defense Advanced Research Projects Agency and IBM. The UIMA code is governed by the Common Public Licence, an official licence approved by the Open Source Initiative, and is available on the SourceForge website.

January 29, 2006

Why Data Warehouses?

Most organizations use a data warehousing strategy to integrate their data. A data warehouse offers the following benefits:

  • Data inconsistencies arising from data stored in various systems are eliminated.
  • Queries for the reporting process are expedited without slowing down the operational procedures.
  • Data quality is improved, leading to better decision-making processes.
  • Analytics are easily performed since a data warehouse supports a large storage area.
  • Security practices are better with the infrastructure of a data warehouse.
  • Data can be integrated, filtered and aggregated from various external sources, to provide a single consolidated view for BI users within the organization.

Read more on data integration.

January 28, 2006

Everlasting Digital Archives

With the need to store large amounts of data beyond its active useful period in order to comply with government regulations, most organizations are in a quandary. They are put in a position where they need to build archives for their mounting data. Storage Networking defines an archive as a repository for organizational records that are no longer in use but may need to be accessed again. Physical storage devices take up too much space, and present a problem when records need to be accessed in quick time.

The best bet for firms is a digital archive, which uses digital data formats to store vast amounts of data and requires no physical space. The downside is that digital formats tend to change with amazing speed, which renders old records unreadable. Digital signatures are being used to overcome this obstacle. Also, data is stored as zeroes and ones, which are naturally unstable and cannot be read directly; computers are needed to decode them. To surmount this hurdle, multiple copies are preserved in groups of networked systems.

A modern archive has to be portable or software/hardware-agnostic. By hardware-agnostic, we mean that the software needed to run the innermost preservation layer must be hardware-independent and portable. Software-agnostic refers to the aspect that digital records must outlast the applications that created them.

A good digital archive should have gateways for the movement of data and metadata. It should also be relatively inexpensive when one considers the colossal amount of data that needs to be archived. Almost 80 percent of archived data is past its active life. So the main details that an organization seeking to build a competent digital archive needs to be concerned with are how the data gets into the archive (the process of ingestion), the methods used to access the data when the need arises, and the ways and means of preserving the stored data. 
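
The idea of preserving multiple copies across networked systems can be sketched as a majority check over replicas. The article does not name a specific mechanism, so the Python example below is an assumption: it compares SHA-256 digests of each copy and restores any copy that disagrees with the majority. The node names and record contents are hypothetical.

```python
import hashlib
from collections import Counter

def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a stored record."""
    return hashlib.sha256(data).hexdigest()

def repair(replicas: dict) -> dict:
    """Replace any copy whose digest disagrees with the majority digest."""
    counts = Counter(digest(data) for data in replicas.values())
    majority_digest, _ = counts.most_common(1)[0]
    good = next(d for d in replicas.values() if digest(d) == majority_digest)
    return {node: good for node in replicas}

# Three hypothetical networked systems, one holding a damaged copy.
replicas = {
    "node_a": b"1906 census, page 12",
    "node_b": b"1906 census, page 12",
    "node_c": b"1906 census, pagX 12",   # simulated corruption
}
repaired = repair(replicas)
print(repaired["node_c"])
```

A check like this only works while a majority of the copies remain intact, which is one reason an archive would keep more than two replicas.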

January 21, 2006

To Delete or Not – That is the Question

Data deletion issues have become more complicated for companies with new data retention laws coming into effect every year. With acts like Sarbanes-Oxley and HIPAA delineating rules for preservation of data, firms find themselves facing the mammoth task of deciding on which data items to delete and which to retain.

There appears to be a simple rule to follow: delete data which does not have any legal implications. But there is no way of foretelling if the data deleted today will be legally valuable at some point in the future. So the easiest decision is to save everything, the downside of which is the slowdown of performance and the pileup of massive amounts of data.

Archiving may be the answer for a few companies like Saint-Gobain Crystals of Newbury, Ohio. The firm follows an archiving plan using BrightStor ArcServe Backup from Computer Associates (CA) and backs up more than 3TB of data every day. To improve data access, the company maintains a separate archiving environment. Infrequently accessed files are pushed to the bottom of the storage stack.

Firms that take the archiving route should be prepared to face the huge costs involved. According to industry estimates from Horison Information Strategies, compliance could account for as much as 5 percent of a typical IT budget, i.e., companies spent as much as $15.5 billion in 2005 on compliance.

A few pass off the responsibility for storage on to the end user. Archived data beyond a certain limit is automatically burned on a CD and sent to the end user.

At the other end of the spectrum, Fortune 100 firms, large retail chains and telecom industries are still cleaning out e-mail and workstations every two months. While larger companies may be able to afford complying with data retention laws, for the majority, the issue boils down to a trade-off between the risk of deletion and the price of compliance.

January 19, 2006

Oracle Releases Vulnerability Patches

Under its Critical Patch Update program, Oracle Corporation has released patches for over 80 vulnerabilities in its database and application server software, and its collaboration and e-business suites, as part of its scheduled quarterly update. Chief among the flaws that affected Oracle databases is the weakness that gives administrator privileges to any user with basic access rights, and also allows prospective attackers to prevent illegal operations from being recorded by the database server's built-in auditing mechanism. The next update is slated for April 12. Computer World reports:

Oracle has said that it will release highly integrated patches that combine fixes for multiple high-priority vulnerabilities. The patches are cumulative, which means that users who miss applying patches one quarter can apply a cumulative update the following quarter to address both the previous problems and any new ones that might have cropped up.

January 09, 2006

Shanghai Stock Exchange Chooses Teradata for Data Warehouse Project

The Shanghai Stock Exchange (SSE) has chosen Teradata, a division of NCR Corp. (NYSE: NCR), for the second phase of its enterprise data warehouse project, "Dimension Data Store", which began in 2004. SSE has successfully completed the first phase of its operational data store consolidation in concert with Teradata, and has moved on to build an enterprise-wide decision-support system in the next phase.

The bourse's ultimate goal is to utilize the Teradata Warehouse platform to develop more products, reduce operational risks, improve monitoring capability and provide additional information services, said Bai Shuo, CTO of SSE. The project is expected to be completed this year, and will provide a single view of the enterprise, advanced metadata management and application, a complete data warehouse management architecture, and structured and non-structured data application.

January 07, 2006

Data Warehouses - Managing Information

In today's competitive global business environment, it is crucial for organizations to understand and manage information in order to make timely decisions and respond to changing trends. Data warehouses have been at the forefront of information technology applications that use digital information effectively for business planning and decision-making. They are computer-based information systems that hold data originating from another application or from an external system or source. Data from various online transaction processing applications is selectively extracted and organized in the data warehouse. A data warehouse can boost operating efficiency, lower costs, and bring companies closer to their customers. Its scope may be as broad as accumulating information for the entire enterprise or as narrow as a personal data warehouse for a single manager. Developing a good data warehouse requires careful planning, requirements definition, design, prototyping and implementation. The Financial/Data Manager of Student Life, a tri-weekly college newspaper published by Washington University Student Media, says:

The availability of data warehousing will allow users to easily perform many types of data analysis (which are not easily done right now) that are necessary to make decisions in their positions, whether they are managers or cabinet members.

Read here how Data Warehousing solutions helped the Oregon Employment Department

July 24, 2005

Information gathering during the post holiday season

Retailers need to be extra careful during the holiday season, when the bulk of their sales takes place. Since a business may have multiple interfaces with the customer (for example, catalogs, physical stores, online shopping, and door-to-door salesmen), it becomes important to maintain co-ordination between them. Each point of contact gives retailers information about their customers, such as their family profile and buying habits. Retailers can use this information to their advantage by targeting customers better and improving their products and services, their supply chain, and their marketing efforts. The important thing is to ensure that information derived from the data is distributed back to the different contact points. If this is not done, one contact point may start to encroach upon the business of the others. Companies offer their customers maximum convenience, including a buy-anywhere, return-anywhere policy. This can sometimes overburden the staff and lead to human errors. Businessintelligence reports:

These details may seem trivial on the surface. If margins are high, many companies may not see the urgency of keeping their customer data clean and up to date.

Read More: Avoiding the Post Holiday Blues: Profiting From Returns

July 01, 2005

Sun Microsystems to concentrate on open-source databases

Sun Microsystems does not intend to develop large transactional databases along the lines of those from Oracle, IBM, and Microsoft. Sun is to acquire the enterprise integration company SeeBeyond Technology, hoping that this will help it address the data integration needs of its customers. The sales force of the acquired company will become a part of Sun's service-oriented architecture and will be integrated with its existing sales team. Sun will introduce a sixth suite in its Java Enterprise System platform, the Sun Java System Integration Suite. Sun has said that it will be particularly careful about how it handles the integration of future acquisitions. Sun also plans to improve its systems management, and its plans to open-source Solaris are a part of that effort, which should lead to the introduction of better systems management tools. Eweek reports:

I feel the pressure, as I have the grid guys on one side asking us to do it and threatening to do it themselves if we don't, and on the other side I have the market pointing out the other players and what they are doing on that front and the need for interoperability

Read More: Sun Shuns Big Database, Embraces Open Source

June 24, 2005

Data warehousing to support efficient decision making

The system development life cycle works on the assumption that the requirements the warehousing tool will fulfil are known before design of the tool begins. DSS analysts, on the contrary, often learn what they expect from a warehousing system only after they have analysed data. This approach is contrary to what is required for a robust data warehousing model. Dimensional models of these warehousing systems consider the business vocabulary and the business processes in order to create a schema that is based on them. Thus, dimensional models concentrate on gathering analytical requirements instead of data requirements. The success of dimensional models depends on creating a prototype of the design and getting the DSS analysts to work with designs that hold actual data. This helps the DSS analysts obtain a clearer picture of their requirements. Businessintelligence reports:

The pattern of behavior that drove the dimensional modeling community to a recognition of the need for rapid-prototype-and-iterate cycles was, by the end of the 1990s, quite widely reported, and cross-cultural.

Read More: Understanding The Data Warehouse Lifecycle Model
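
As a rough illustration of the idea above, a dimensional model can be prototyped in a few lines and tried out against real rows. The sketch below uses Python's built-in sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns and the sample figures are invented for illustration and are not taken from the article.

```python
import sqlite3

# In-memory database: enough for a throwaway prototype.
conn = sqlite3.connect(":memory:")

# Two dimension tables named in business vocabulary (product category,
# quarter) and one fact table holding the measured sales amounts.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, quarter  TEXT);
CREATE TABLE fact_sales  (product_id INTEGER, date_id INTEGER, amount REAL);

INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
INSERT INTO dim_date    VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO fact_sales  VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 1, 80.0);
""")

# An analyst's question, "sales by category and quarter", maps directly
# onto joins from the fact table to its dimensions.
rows = conn.execute("""
    SELECT p.category, d.quarter, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id    = f.date_id
    GROUP BY p.category, d.quarter
    ORDER BY p.category, d.quarter
""").fetchall()

for category, quarter, total in rows:
    print(category, quarter, total)
```

Because the schema is this small, an analyst can look at the output, ask for a different dimension, and have the prototype changed in minutes, which is the prototype-and-iterate cycle the entry describes.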

June 17, 2005

Speed up the data mining and analytics

Given the multiplicity of data sources and the vast amount of data that can be collected, organizations can get overwhelmed with information. Moreover, in spite of having the information, they may not gain any competitive advantage for lack of timely data analysis. Jit Saxena, CEO of Netezza Corporation, feels that the solution lies in merging the database, storage and analytics. The solution he provides is the Netezza Performance Server System (NPS), which allows companies to mine data minutely and analyse it in real time. The NPS offers a more cost-effective and better-performing option than conventional data storage servers, in which the server, database software, and storage are separate and integrating them is often the responsibility of the customer. NPS is an appliance that lets BI applications such as BusinessObjects run alongside it; only a simple plug-in is required. Dmreview reports:

The major issue is that as data size increases, query performance and load performance decrease, and cost of ownership increases because the organizations must continuously tune their system to somehow try to maintain their previous levels of performance.

Read More: Challenging the Status Quo

June 16, 2005

Faulty data across databases can make it difficult for people

Government records, insurance databases, etc., are examples of huge databases that hold information on individuals. Inaccuracies in personal identity information can lead to cancelled social security, refused loans, and rejected insurance and credit card applications. Extraction, transformation, and loading (ETL) is one technique used to rectify data. This is of particular significance when technologies are merged, for example SAP with Microsoft and SAP with Macromedia. Present-generation tools, however, cannot ensure the accuracy of the data presented; an answer may lie in visualization technology. The Federal Information Security Management Act requires that government agencies be graded according to their ability to safeguard data. This implies that data originating at government sources must be factually correct and properly encrypted. It is also up to individuals to manage their information properly, for instance by not leaving information behind online when surfing. Eweek.com reports:

ChoicePoint doesn't take responsibility for aggregating and propagating filthy data. ChoicePoint says it's the data sources—RMVs, court, etc.—that are responsible for the data. If it's from the government, it must be good stuff, the thinking goes.

Read More: Garbage In, Garbage Out of Control
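
To make the extraction, transformation, and loading step mentioned above concrete, here is a minimal sketch in Python. The sample records, the field names (name, id_number, state) and the nine-digit rule are assumptions made up for this example; real cleansing rules depend on the data source.

```python
import csv
import io

# Hypothetical raw export with inconsistent spacing, casing and ID formats.
RAW = """name,id_number,state
 alice smith ,123-45-6789,or
BOB JONES,987654321,OR
Carol Lee,12-345,WA
"""

def extract(text):
    """Read the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Normalise each record; set aside records whose ID is malformed."""
    clean, rejected = [], []
    for row in rows:
        digits = "".join(ch for ch in row["id_number"] if ch.isdigit())
        if len(digits) != 9:          # assumed rule: an ID has nine digits
            rejected.append(row)
            continue
        clean.append({
            "name": " ".join(row["name"].split()).title(),
            "id_number": digits,
            "state": row["state"].strip().upper(),
        })
    return clean, rejected

def load(rows, target):
    """Store cleaned records in the target, keyed by ID."""
    for row in rows:
        target[row["id_number"]] = row
    return target

warehouse = {}
clean, rejected = transform(extract(RAW))
load(clean, warehouse)
print(len(clean), "loaded;", len(rejected), "rejected for review")
```

Note that a step like this can only standardise and flag records. It cannot tell whether a well-formed record is factually right, which is the limit of present-generation tools that the entry points out.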