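Apache Flink is an open-source platform for distributed, unified stream and batch processing. At its core, Flink is a stream processing engine that provides data distribution, communication, and automatic fault tolerance. Flink builds batch processing on top of the streaming engine…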
In this guide we will start from scratch and go from setting up a Flink project to running a streaming analysis program on a Flink cluster.
Wikipedia provides an IRC channel where all edits to the wiki are logged. We are going to read this channel in Flink and count the number of bytes that each user edits within a given window of time. This is easy enough to implement in a few minutes using Flink, but it will give you a good foundation from which to start building more complex analysis programs on your own.
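Before bringing Flink in, it may help to see what "bytes per user within a window of time" means as a computation. The following is only a plain-Java sketch of that idea, not Flink code: the class WindowedByteCount, the Edit type, and its fields are hypothetical names invented for this illustration, and it assumes fixed, non-overlapping windows over a finite list of edits with non-negative timestamps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WindowedByteCount {

    /** Hypothetical stand-in for one wiki edit event (not a Flink type). */
    static final class Edit {
        final String user;
        final long timestampMillis;
        final int byteDiff;

        Edit(String user, long timestampMillis, int byteDiff) {
            this.user = user;
            this.timestampMillis = timestampMillis;
            this.byteDiff = byteDiff;
        }
    }

    /**
     * Assigns each edit to a fixed, non-overlapping window of windowMillis
     * and sums byteDiff per user inside each window.
     * Outer key: window start time. Inner key: user name.
     * Assumes timestamps are non-negative.
     */
    static Map<Long, Map<String, Long>> sumBytesPerUser(List<Edit> edits, long windowMillis) {
        Map<Long, Map<String, Long>> result = new TreeMap<>();
        for (Edit e : edits) {
            // Round the timestamp down to the start of its window.
            long windowStart = e.timestampMillis - (e.timestampMillis % windowMillis);
            result.computeIfAbsent(windowStart, k -> new TreeMap<>())
                  .merge(e.user, (long) e.byteDiff, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit("alice", 1_000L, 120));
        edits.add(new Edit("bob", 2_500L, -40));
        edits.add(new Edit("alice", 4_000L, 30));
        edits.add(new Edit("alice", 6_000L, 7));

        // Five-second windows: [0, 5000) and [5000, 10000).
        System.out.println(sumBytesPerUser(edits, 5_000L));
    }
}
```

With the sample edits above, this prints {0={alice=150, bob=-40}, 5000={alice=7}}. A streaming engine does the same grouping and summing, but over an unbounded stream, emitting each window's result as the window closes.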
We are going to use a Flink Maven Archetype for creating our project structure. Please see the Java API Quickstart for more details about this. For our purposes, the command to run is this:
You can edit the groupId, artifactId and package if you like. With the above parameters, Maven will create a project structure that looks like this:
The root directory contains our pom.xml file, which already has the Flink dependencies added, and src/main/java contains several example Flink programs. We can delete the example programs, since we are going to start from scratch:
As a last step we need to add the Flink Wikipedia connector as a dependency so that we can use it in our program. Edit the dependencies section of the pom.xml so that it looks like this:
Notice the flink-connector-wikiedits_2.11 dependency that was added. (This example and the Wikipedia connector were inspired by the Hello Samza example of Apache Samza.)
It's coding time. Fire up your favorite IDE and import the Maven project, or open a text editor and create the file src/main/java/wikiedits/WikipediaAnalysis.java:
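A minimal starting skeleton for that file might look like the following. It assumes the package name wikiedits, taken from the directory in the path above, and contains nothing but an empty main method to be filled in step by step.

```java
package wikiedits;

public class WikipediaAnalysis {

    // Entry point of the job; declared to throw Exception so that later
    // steps can call methods that throw checked exceptions.
    public static void main(String[] args) throws Exception {
        // Intentionally empty for now.
    }
}
```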
The program is very basic now, but we will fill it in as we go. Note that I won't give import statements here, since IDEs can add them automatically. At the end of this section I'll show the complete code with import statements, in case you simply want to skip ahead and enter that in your editor.
The first step in a Flink program is to create a StreamExecutionEnvironment (or an ExecutionEnvironment if you are writing a batch job). This can be used to set execution parameters and create sources for reading from external systems. So let's go ahead and add this to the main method: