Spark development in Windows
Zack, 2017-05-12
Introduction: take your first step in Spark development.
This post covers four main topics:
1. How to simulate Hadoop on Windows.
2. How to install Spark on Windows.
3. How to install Scala IDE on Windows.
4. How to package Scala code into a jar with SBT and run it on Spark.
Simulate Hadoop on Windows:
Download winutils.exe from the following link:
https://sundog-spark.s3.amazonaws.com/winutils.exe
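The post does not say where to save winutils.exe. As a minimal sketch, assuming it was saved as C:\winutils\bin\winutils.exe (a hypothetical location), a Scala program can point Hadoop at that folder before it creates a Spark context. Hadoop reads the `hadoop.home.dir` system property, or the HADOOP_HOME environment variable if the property is not set, and expects bin\winutils.exe underneath it. The object and method names here are illustrative only.

```scala
object HadoopHomeSetup {
  // Hypothetical location: assumes winutils.exe was saved as
  // C:\winutils\bin\winutils.exe. Adjust to wherever you put it.
  val assumedHadoopHome: String = "C:\\winutils"

  // Call this before creating a SparkContext so that Hadoop's
  // shell utilities can find bin\winutils.exe under this folder.
  def configure(): Unit = {
    System.setProperty("hadoop.home.dir", assumedHadoopHome)
  }
}
```

Setting the HADOOP_HOME environment variable to the same folder is an alternative that needs no code change.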
Install Spark on Windows:
1. Download Spark from the official website:
http://spark.apache.org/downloads.html
2. Unzip it to the folder C:\spark
3. Set SPARK_HOME and PATH in the Windows environment variables:
4. Create the user PATH variable:
5. Verify that the installation succeeded:
Open CMD in the folder C:\spark\bin and run:
‘spark-shell’
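Once the scala> prompt appears, a quick way to confirm that Spark itself works is to run a tiny job. The lines below are a sketch meant to be typed at the spark-shell prompt, where the shell predefines the SparkContext as `sc`; they are not a standalone program.

```scala
// Typed at the scala> prompt of spark-shell; `sc` is created by the shell.
val numbers = sc.parallelize(1 to 100)  // distribute a small range as an RDD
println(numbers.count())                // number of elements in the RDD
println(numbers.sum())                  // sum of 1 to 100, returned as a Double
```

If both lines print a value without errors, the installation and the PATH setup are working.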
Install Scala IDE (Eclipse) on Windows:
1. Download the zip from the official website:
2. Unzip it to C:\eclipse
3. Make sure a suitable JRE/JDK is installed, then open ‘eclipse.exe’.
Choose the newly created folder ‘C:\SparkScala’ as the workspace.
4. Then you can do the following:
a. Create a new Scala project:
b. Create a new package under src
c. Create a new Scala code file
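To check that the IDE compiles and runs Scala, the new code file could start with something as small as the sketch below. The object name `HelloScala` is hypothetical; if the file lives in a package, a matching `package` line goes at the top.

```scala
object HelloScala {
  // Builds the greeting separately so it is easy to check.
  def greeting(name: String): String = s"Hello, $name!"

  def main(args: Array[String]): Unit = {
    println(greeting("Scala IDE"))
  }
}
```

Running it as a Scala application from the IDE should print the greeting in the console view.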
Install SBT on your PC:
1. Download SBT from the official website:
Download the ZIP or TGZ package and expand it.
2. Extract the downloaded package to a directory of your choice, for example d:\sbt
3. Create the file sbtconfig.txt in the sbt\bin directory
4. Set SBT_HOME and PATH in the Windows environment variables:
5. Create the user PATH variable:
6. Run SBT once so that it downloads the jar packages it needs; this takes quite a long time.
Run the ‘sbt’ command from the folder d:\sbt
Press Ctrl + C to stop it if there is any problem.
7. Create a jar file with the command ‘sbt package’
a. Put the finished Scala code into the following file structure:
\test\src\main\scala\SimpleApp.scala
b. Put the sbt configuration file in the project root folder:
\test\simple.sbt
c. Find the jar location in the log output:
D:\sbt\test\target\scala-2.11\simple-project_2.11-1.0.jar
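d. Run the jar in Spark:
a) Copy the jar into the Spark bin folder:
C:\spark\bin\simple-project_2.11-1.0.jar
b) Run this command in CMD from C:\spark\bin\:
‘spark-submit simple-project_2.11-1.0.jar’
c) It shows the results below:
The program finds how many lines in the file contain the letter a and how many contain the letter b.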
Lines with a: 62. Lines with b: 30
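The post does not show the contents of SimpleApp.scala or simple.sbt. The sketch below is modeled on the self-contained application from the Spark quick-start guide and is not necessarily the author's code. The Scala and Spark version numbers, the application name, and the input path C:/spark/README.md are assumptions, chosen to be consistent with the jar name simple-project_2.11-1.0.jar and the C:\spark install folder used above.

A possible \test\simple.sbt:

```scala
// Project name and version determine the jar name simple-project_2.11-1.0.jar
name := "Simple Project"

version := "1.0"

// Assumed Scala 2.11.x release; any 2.11 version yields the _2.11 suffix
scalaVersion := "2.11.8"

// Assumed Spark release; use the version you actually installed
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1"
```

A possible \test\src\main\scala\SimpleApp.scala:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // Assumed input file: the README.md shipped in the C:\spark folder
    val logFile = "C:/spark/README.md"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    // Read the file as an RDD of lines; cache it because it is scanned twice
    val logData = sc.textFile(logFile, 2).cache()
    // Count lines that contain each letter (a line with several matches counts once)
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println(s"Lines with a: $numAs. Lines with b: $numBs")
    sc.stop()
  }
}
```

The exact counts depend on the input file, so a different Spark release or a different file will print different numbers from the ones shown above.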