Mining the Social Web

How can you tap into the wealth of social web data to discover who's making connections with whom, what they're talking about, and where they're located? With this expanded and thoroughly revised edition, you'll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

■ Employ IPython Notebook, the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social websites

■ Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data

■ Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects

■ Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit

■ Take advantage of more than two dozen Twitter recipes, presented in O'Reilly's popular "problem/solution/discussion" cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It's designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Learn how to turn data into decisions

From startups to the Fortune 500, smart companies are betting on data-driven insight, seizing the opportunities that are emerging from the convergence of four powerful trends:

New methods of collecting, managing, and analyzing data

Cloud computing that offers inexpensive storage and flexible, on-demand computing power for massive data sets

Visualization techniques that turn complex data into images that tell a compelling story

Tools that make the power of data available to anyone

Get control over big data and turn it into insight with O’Reilly’s Strata offerings. Find the inspiration and information to create new products or revive existing ones, understand customer behavior, and get the data edge.


This book has been carefully designed to provide an incredible learning experience for a particular target audience, and in order to avoid any unnecessary confusion about its scope or purpose by way of disgruntled emails, bad book reviews, or other misunderstandings that can come up, the remainder of this preface tries to help you determine whether you are part of that target audience. As a very busy professional, I consider my time my most valuable asset, and I want you to know right from the beginning that I believe that the same is true of you. Although I often fail, I really do try to honor my neighbor above myself as I walk out this life, and this preface is my attempt to honor you, the reader, by making it clear whether or not this book can meet your expectations.

Managing Your Expectations

Some of the most basic assumptions this book makes about you as a reader are that you want to learn how to mine data from popular social web properties, avoid technology hassles when running sample code, and have lots of fun along the way. Although you could read this book solely for the purpose of learning what is possible, you should know up front that it has been written in such a way that you really could follow along with the many exercises and become a data miner once you’ve completed the few simple steps to set up a development environment. If you’ve done some programming before, you should find that it’s relatively painless to get up and running with the code examples. Even if you’ve never programmed before but consider yourself the least bit tech-savvy, I daresay that you could use this book as a starting point to a remarkable journey that will stretch your mind in ways that you probably haven’t even imagined yet.

To fully enjoy this book and all that it has to offer, you need to be interested in the vast possibilities for mining the rich data tucked away in popular social websites such as Twitter, Facebook, LinkedIn, and Google+, and you need to be motivated enough to download a virtual machine and follow along with the book’s example code in IPython Notebook, a fantastic web-based tool that features all of the examples for every chapter. Executing the examples is usually as easy as pressing a few keys, since all of the code is presented to you in a friendly user interface. This book will teach you a few things that you’ll be thankful to learn and will add a few indispensable tools to your toolbox, but perhaps even more importantly, it will tell you a story and entertain you along the way. It’s a story about data science involving social websites, the data that’s tucked away inside of them, and some of the intriguing possibilities of what you (or anyone else) could do with this data.

If you were to read this book from cover to cover, you’d notice that this story unfolds on a chapter-by-chapter basis. While each chapter roughly follows a predictable template that introduces a social website, teaches you how to use its API to fetch data, and introduces some techniques for data analysis, the broader story the book tells crescendos in complexity. Earlier chapters in the book take a little more time to introduce fundamental concepts, while later chapters systematically build upon the foundation from earlier chapters and gradually introduce a broad array of tools and techniques for mining the social web that you can take with you into other aspects of your life as a data scientist, analyst, visionary thinker, or curious reader.

Some of the most popular social websites have transitioned from fad to mainstream to household names over recent years, changing the way we live our lives on and off the Web and enabling technology to bring out the best (and sometimes the worst) in us. Generally speaking, each chapter of this book interlaces slivers of the social web along with data mining, analysis, and visualization techniques to explore data and answer the following representative questions:

• Who knows whom, and which people are common to their social networks?

• How frequently are particular people communicating with one another?

• Which social network connections generate the most value for a particular niche?

• How does geography affect your social connections in an online world?

• Who are the most influential/popular people in a social network?

• What are people chatting about (and is it valuable)?

• What are people interested in based upon the human language that they use in a digital world?
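To make the first two questions concrete, here is a minimal sketch in plain Python. It is not taken from the book's example code: the names `friends`, `messages`, `common_connections`, `most_connected`, and `communication_frequency` are hypothetical, and the data is a hand-written toy network rather than anything fetched from a real social web API.

```python
from collections import Counter

# Hypothetical toy data: each person maps to the set of people they know.
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"alice", "carol"},
}

# Hypothetical message log: one (sender, recipient) pair per message.
messages = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("alice", "bob"),
]


def common_connections(graph, a, b):
    """Return the people that both a and b know (set intersection)."""
    return graph[a] & graph[b]


def most_connected(graph, n=1):
    """Rank people by their number of direct connections."""
    degree = Counter({person: len(known) for person, known in graph.items()})
    return degree.most_common(n)


def communication_frequency(pairs):
    """Count messages per unordered pair of people."""
    return Counter(frozenset(pair) for pair in pairs)


if __name__ == "__main__":
    print(common_connections(friends, "bob", "dave"))
    print(most_connected(friends, n=2))
    print(communication_frequency(messages))
```

Counting direct connections is only a crude stand-in for influence or popularity, and real analyses would typically start from data retrieved through a site's API, but the same set and counting operations carry over.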



