Data Mining Essay

Microsoft Time Series

The Microsoft Time Series algorithm creates models that can be used to predict continuous variables over time from both OLAP and relational data sources. For example, you can use the Microsoft Time Series algorithm to predict sales and profits based on the historical data in a cube.

With this algorithm, you can choose one or more variables to predict, but they must be continuous. Each model can have only one case series. The case series identifies the position of each case in the series, such as the date when you are looking at sales over a period of several months or years.

A case may contain a set of variables (for example, sales at different stores). The Microsoft Time Series algorithm can use cross-variable correlations in its predictions. For example, prior sales at one store may be useful in predicting current sales at another store.
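To make the cross-variable idea concrete, the following is a minimal sketch, not the Microsoft Time Series implementation. It assumes a plain linear autoregression in which current sales at one store are fitted from the previous period's sales at both stores. The function names, the store data, and the coefficients used to generate the data are all hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_cross_series(store_a, store_b):
    """Least-squares fit of b[t] = w_a * a[t-1] + w_b * b[t-1] + c."""
    rows = [[store_a[t - 1], store_b[t - 1], 1.0] for t in range(1, len(store_b))]
    targets = store_b[1:]
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical sales history: store B depends on last period's sales at store A.
store_a = [10.0, 12.0, 9.0, 14.0, 11.0, 15.0, 13.0, 16.0]
store_b = [5.0]
for t in range(1, len(store_a)):
    store_b.append(0.8 * store_a[t - 1] + 0.1 * store_b[t - 1] + 1.0)

w_a, w_b, c = fit_cross_series(store_a, store_b)
# One-step-ahead forecast for store B from the latest values of both series.
next_b = w_a * store_a[-1] + w_b * store_b[-1] + c
```

Because the synthetic data were generated from the same linear form, the fit recovers a large weight on store A's prior sales, which is the kind of cross-series signal the paragraph above describes.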

Microsoft Neural Network

In Microsoft SQL Server 2005 Analysis Services, the Microsoft Neural Network algorithm creates classification and regression mining models by constructing a multilayer perceptron network of neurons. Like the Microsoft Decision Trees algorithm provider, the algorithm calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute. The algorithm provider processes the entire set of cases, iteratively comparing the predicted classification of the cases with their known actual classification. The errors from the first iteration over the entire set of cases are fed back into the network and used to modify the network's performance for the next iteration, and so on. You can later use these probabilities to predict an outcome of the predictable attribute, based on the input attributes. One of the primary differences between this algorithm and the Microsoft Decision Trees algorithm is that its learning process optimizes the network parameters to minimize the error, whereas the Microsoft Decision Trees algorithm splits rules to maximize information gain. The algorithm supports the prediction of both discrete and continuous attributes.
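The iterative error feedback described above can be sketched as a small multilayer perceptron trained with batch gradient descent. This is a generic illustration, not the Microsoft implementation. The network size, learning rate, epoch count, and training cases are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, output); the last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(cases, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in cases) / len(cases)


def train_epoch(cases, w_hidden, w_out, rate):
    """Process the entire case set, then feed the accumulated error back once."""
    grad_hidden = [[0.0] * len(row) for row in w_hidden]
    grad_out = [0.0] * len(w_out)
    for x, y in cases:
        hidden, output = forward(x, w_hidden, w_out)
        # Gradient of half the squared error at the output neuron.
        delta_out = (output - y) * output * (1.0 - output)
        for j, h in enumerate(hidden + [1.0]):
            grad_out[j] += delta_out * h
        # Propagate the output error back to each hidden neuron.
        for j, h in enumerate(hidden):
            delta_hidden = delta_out * w_out[j] * h * (1.0 - h)
            for i, v in enumerate(x + [1.0]):
                grad_hidden[j][i] += delta_hidden * v
    n = len(cases)
    for j in range(len(w_out)):
        w_out[j] -= rate * grad_out[j] / n
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= rate * grad_hidden[j][i] / n


# Hypothetical training cases: two inputs, one predictable attribute (logical OR).
cases = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

random.seed(0)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]

error_before = mean_squared_error(cases, w_hidden, w_out)
for _ in range(500):
    train_epoch(cases, w_hidden, w_out, rate=0.5)
error_after = mean_squared_error(cases, w_hidden, w_out)
```

Each call to `train_epoch` corresponds to one iteration over the whole case set. The weights change only after every case has been compared with its known value, and `error_after` can be compared with `error_before` to see the effect of training.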

Microsoft Linear Regression

The Microsoft Linear Regression algorithm is a particular configuration of the Microsoft Decision Trees algorithm, obtained by disabling splits, so that the whole regression formula is built in a single root node. The algorithm supports the prediction of continuous attributes.
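As an illustration of a regression formula held in a single node, here is a minimal sketch of ordinary least squares for one input attribute. It is not the Microsoft implementation. The function name and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cases that lie exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted continuous value for x = 5
```

With no splits, every case falls into the same node, so a single formula of this kind covers the whole data set.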

Microsoft Logistic Regression

The Microsoft Logistic Regression algorithm is a particular configuration of the Microsoft Neural Network algorithm, obtained by eliminating the hidden layer. The algorithm supports the prediction of both discrete and continuous attributes.
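Removing the hidden layer leaves a single sigmoid output connected directly to the inputs, which is the logistic regression model. The sketch below trains such a model with batch gradient descent on the cross-entropy error. It is a generic illustration rather than the Microsoft implementation, and the data, learning rate, and epoch count are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(cases, rate=0.5, epochs=200):
    """Fit P(y = 1 | x) = sigmoid(weight * x + bias) by batch gradient descent."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in cases:
            # Gradient of the cross-entropy error for one case.
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= rate * grad_w / len(cases)
        bias -= rate * grad_b / len(cases)
    return weight, bias


# Hypothetical cases: one input attribute and a two-state predictable attribute.
cases = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

weight, bias = train_logistic(cases)
prob_low = sigmoid(weight * -1.0 + bias)   # probability of state 1 when x = -1
prob_high = sigmoid(weight * 1.0 + bias)   # probability of state 1 when x = 1
```

The model outputs a probability for each state of the predictable attribute, and a threshold such as 0.5 turns that probability into a discrete prediction.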


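Summary: Microsoft SQL Server 2005 provides an integrated environment for creating and working with data mining models. This tutorial uses four scenarios (targeted mailing, forecasting, sequence analysis, and clustering) to demonstrate how to use the mining model algorithms, the mining model viewers, and the data mining tools.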


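This data mining tutorial walks you through the process of creating data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools in SQL Server 2005 make it easy to build comprehensive solutions for a variety of projects, including market basket analysis, forecasting analysis, and targeted mailing analysis. The scenarios for these solutions are explained in more detail later in the tutorial.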

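The most visible components of SQL Server 2005 are the workspaces that you use to create and work with data mining models. The online analytical processing (OLAP) and data mining tools are consolidated into two working environments: Business Intelligence Development Studio and SQL Server Management Studio. With Business Intelligence Development Studio, you can develop an Analysis Services project while disconnected from the server. When the project is ready, you can deploy it to the server. You can also work directly against the server. The main function of SQL Server Management Studio is to manage the server. Each environment is described in more detail later. For more information about choosing between the two environments, see "Choosing Between SQL Server Management Studio and Business Intelligence Development Studio" in SQL Server Books Online.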


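After you create a mining model, you will want to explore it, looking for interesting patterns and rules. Each mining model viewer in the editor is customized to explore models built with a specific algorithm. For more information about the viewers, see "Viewing a Data Mining Model" in SQL Server Books Online.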


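To create predictions, you use the Data Mining Extensions (DMX) language. DMX extends traditional SQL syntax with commands to create, modify, and predict against mining models. For details on DMX, see "Data Mining Extensions (DMX) Reference" in SQL Server Books Online. Because creating a prediction can be complicated, the data mining editor includes a tool called Prediction Query Builder, which lets you build DMX queries in a graphical interface. You can also view the automatically generated DMX statement in the tool.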


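Some of the most important steps in building a data mining solution are consolidating and preparing the data that is used to build the mining models. SQL Server 2005 includes a DTS working environment and a set of DTS tools for cleaning, validating, and preparing data. For more information about DTS, see "DTS Data Mining Tasks and Transformations" in SQL Server Books Online.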
