Learning From Data 2nd Ed (PDF, high-resolution full-text download)

c#小王子 2021-04-09 1839 Software, Programming, Python




There are two problems in modern science:


 too many people use different terminology to solve the same problems;

 even more people use the same terminology to address completely different issues.

Anonymous

In recent years, there has been an explosive growth of methods for learning (or estimating dependencies) from data. This is not surprising given the proliferation of


 low-cost computers (for implementing such methods in software)

 low-cost sensors and database technology (for collecting and storing data)

 highly computer-literate application experts (who can pose “interesting” application problems)


A learning method is an algorithm (usually implemented in software) that estimates an unknown mapping (dependency) between a system’s inputs and outputs from the available data, namely from known (input, output) samples. Once such a dependency has been accurately estimated, it can be used for prediction of future system outputs from the known input values. This book provides a unified description of principles and methods for learning dependencies from data.
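As a rough illustration of this definition (not an example taken from the book), the sketch below estimates a linear dependency from a handful of made-up (input, output) samples and then uses it to predict the output for a new input. The function names `fit_line` and `predict`, the choice of ordinary least squares, and the sample values are all assumptions made for the illustration.

```python
# Toy sketch: estimate a dependency y ≈ slope * x + intercept from known
# (input, output) samples, then use the estimate for prediction.
# All names and data here are hypothetical.

def fit_line(samples):
    """Return the least-squares (slope, intercept) for a list of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the output for input x using the estimated dependency."""
    slope, intercept = model
    return slope * x + intercept


# Known (input, output) samples: the "available data".
training = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
model = fit_line(training)

# Prediction of a future output from a known input value.
print(predict(model, 4.0))
```

The estimate is only as good as the assumed form of the dependency; a linear model is used here purely because it keeps the sketch short.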


Methods for estimating dependencies from data have been traditionally explored in diverse fields such as statistics (multivariate regression and classification), engineering (pattern recognition), and computer science (artificial intelligence, machine learning, and, more recently, data mining). Recent interest in learning from data has resulted in the development of biologically motivated methodologies, such as artificial neural networks, fuzzy systems, and wavelets.


Unfortunately, developments in each field are seldom related to other fields, despite the apparent commonality of issues and methods. The mere fact that hundreds of “new” methods are being proposed each year at various conferences and in numerous journals suggests a certain lack of understanding of the basic issues common to all such methods.


The premise of this book is that there are just a handful of important principles and issues in the field of learning dependencies from data. Any researcher or practitioner in this field needs to be aware of these issues in order to successfully apply a particular methodology, understand a method’s limitations, or develop new techniques.


This book is an attempt to present and discuss such issues and principles (common to all methods) and then describe representative popular methods originating from statistics, neural networks, and pattern recognition. Often methods developed in different fields can be related to a common conceptual framework. This approach enables better understanding of a method’s properties, and it has methodological advantages over traditional “cookbook” descriptions of various learning algorithms.



Many aspects of learning methods can be addressed under a traditional statistical framework. At the same time, many popular learning algorithms and learning methodologies have been developed outside classical statistics. This happened for several reasons:


1. Traditionally, the statistician’s role has been to analyze the inferential limitations of the structural model constructed (proposed) by the application-domain expert. Consequently, the conceptual approach (adopted in statistics) is parameter estimation for model identification. For many real-life problems that require flexible estimation with finite samples, the statistical approach is fundamentally flawed. As shown in this book, learning with finite samples should be based on the framework known as risk minimization, rather than density estimation.


2. Statisticians have been late to recognize and appreciate the importance of computer-intensive approaches to data analysis. The growing use of computers has fundamentally changed the traditional boundaries between a statistician (data modeler) and a user (application expert). Nowadays, engineers and computer scientists successfully use sophisticated empirical data-modeling techniques (i.e., neural nets) to estimate complex nonlinear dependencies from the data.


3. Statistics (being part of mathematics) has developed into a closed discipline, with its own scientific jargon and academic objectives that favor analytic proofs rather than practical methods for learning from data.
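To make the phrase “risk minimization” in the first reason more concrete, here is a minimal sketch of one possible reading: pick a decision rule by directly minimizing the error measured on the available samples, without first estimating any probability density. The one-dimensional threshold classifier, the function names `empirical_risk` and `minimize_empirical_risk`, and the data are all hypothetical and are not drawn from the book.

```python
# Toy sketch of empirical risk minimization for a 1-D threshold classifier.
# The rule is: predict class 1 if x > threshold, otherwise class 0.
# All names and data here are hypothetical.

def empirical_risk(threshold, samples):
    """Fraction of (x, label) samples misclassified by the threshold rule."""
    errors = sum(
        1 for x, label in samples if (1 if x > threshold else 0) != label
    )
    return errors / len(samples)


def minimize_empirical_risk(samples):
    """Return the candidate threshold with the lowest empirical risk."""
    xs = sorted(x for x, _ in samples)
    # Candidates: one value below every input, plus the midpoints between
    # consecutive inputs. Between two neighboring inputs the risk is constant,
    # so the midpoints cover those intervals.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: empirical_risk(t, samples))


# Finite sample of (input, class label) pairs.
data = [(0.2, 0), (0.9, 0), (1.4, 0), (2.1, 1), (2.8, 1), (3.5, 1)]
best = minimize_empirical_risk(data)
print(best, empirical_risk(best, data))
```

A density-estimation approach would instead model the distribution of each class and derive the rule from those models; the sketch skips that intermediate step entirely, which is the contrast the paragraph draws.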


PREFACE


Historically, we can identify three stages in the development of predictive learning methods. First, in 1985–1992 classical statistics gave way to neural networks (and other empirical methods, such as fuzzy systems) due to an early enthusiasm and naive claims that biologically inspired methods (i.e., neural nets) can achieve model-free learning not subject to statistical limitations. Even though such claims later proved to be false, this stage had a positive impact by showing the power and usefulness of flexible nonlinear modeling based on the risk minimization approach. Then in 1992–1996 came the return of statistics as the researchers and practitioners of neural networks became aware of their statistical limitations, initiating a trend toward interpretation of learning methods using a classical statistical framework. Finally, the third stage, from 1997 to present, is dominated by the wide popularity of support vector machines (SVMs) and similar margin-based approaches (such as boosting), and the growing interest in the Vapnik–Chervonenkis (VC) theoretical framework for predictive learning.


This book is intended for readers with varying interests, including researchers/practitioners in data modeling with a classical statistics background, researchers/practitioners in data modeling with a neural network background, and graduate students in engineering or computer science.


The presentation does not assume a special math background beyond a good working knowledge of probability, linear algebra, and calculus on an undergraduate level. Useful background material on optimization and linear algebra is included in Appendixes A and B, respectively. We do not provide mathematical proofs, but, whenever possible, in place of proofs we provide intuitive explanations and arguments. Likewise, mathematical formulation and discussion of the major concepts and results are provided as needed. The goal is to provide a unified treatment of diverse methodologies (i.e., statistics and neural networks), and to that end we carefully define the terminology used throughout the book. This book is not easy reading because it describes fairly complex concepts and mathematical models for solving inherently difficult (ill-posed) problems of learning with finite data. To aid the reader, each chapter starts with a brief overview of its contents. Also, each chapter is concluded with a summary containing an overview of open research issues and pointers to other (relevant) chapters.


Book chapters are conceptually organized into three parts:


 Part I: Concepts and Theory (Chapters 1–4). Following an introduction and motivation given in Chapter 1, we present formal specification of the inductive learning problem in Chapter 2 that also introduces major concepts and issues in learning from data. In particular, it describes an important concept called an inductive principle. Chapter 3 describes the regularization (or penalization) framework adopted in statistics. Chapter 4 describes Vapnik’s statistical learning theory (SLT), which provides the theoretical basis for predictive learning with finite data. SLT, aka VC theory, is important for understanding various learning methods developed in neural networks, statistics, and pattern recognition, and for developing new approaches, such as SVMs (described in Chapter 9) and noninductive learning settings (described in Chapter 10).


 Part II: Constructive Learning Methods (Chapters 5–8). This part describes learning methods for regression, classification, and density approximation problems. The objective is to show conceptual similarity of methods originating from statistics, neural networks, and signal processing and to discuss their relative advantages and limitations. Whenever possible, we relate constructive learning methods to the conceptual framework of Part I. Chapter 5 describes nonlinear optimization strategies commonly used in various methods. Chapter 6 describes methods for density approximation, which include statistical, neural network, and signal processing techniques for data reduction and dimensionality reduction. Chapter 7 provides descriptions of statistical and neural network methods for regression. Chapter 8 describes methods for classification.



 Part III: VC-Based Learning Methodologies (Chapters 9 and 10). Here we describe constructive learning approaches that originate in VC theory. These include SVMs (or margin-based methods) for several inductive learning problems (in Chapter 9) and various noninductive learning formulations (described in Chapter 10).



The chapters should be followed in a sequential order, as the description of constructive learning methods is related to the conceptual framework developed in the first part of the book. A shortened sequence of Chapters 1–3 followed by Chapters 5, 6, 7, and 8 is recommended for beginning readers who are interested only in the description of statistical and neural network methods. This sequence omits the mathematically and conceptually challenging Chapters 4 and 9. Alternatively, more advanced readers who are primarily interested in SLT and SVM methodology may adopt the sequence of Chapters 2, 3, 4, 9, and 10.



In the course of writing this book, our understanding of the field has changed. We started with the currently prevailing view of learning methods as a collection of tricks. Statisticians have their own bag of tricks (and terminology), neural networks have a different set of tricks, and so on. However, in the process of writing this book, we realized that it is possible to understand the various heuristic methods (tricks) by a sound general conceptual framework. Such a framework is provided by SLT developed mainly by Vapnik over the past 35 years. This theory combines fundamental concepts and principles related to learning with finite data, well-defined problem formulations, and rigorous mathematical theory. Although SLT is well known for its mathematical aspects, its conceptual contributions are not fully appreciated. As shown in our book, the conceptual framework provided by SLT can be used for improved understanding of various learning methods even where its mathematical results cannot be directly applied. Modern learning methods (i.e., flexible approaches using finite data) have slowly drifted away from the original problem statements posed in classical statistical decision and estimation theory. A major conceptual contribution of SLT is in revisiting the problem statement appropriate for modern data mining applications. On the very basic level, SLT makes a clear distinction between the problem formulation and a solution approach (aka inductive principle) used to solve a problem. Although this distinction appears trivial on the surface, it leads to a fundamentally new understanding of the learning problem not explained by classical theory. Although it is tempting to skip directly to constructive solutions, this book devotes enough attention to the learning problem formulation and important concepts before describing actual learning methods.



[Download link]

Link: https://pan.baidu.com/s/1RvRu23RExRcuxWqSbzj2CQ

Extraction code: 9q1h



