Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Yikun Jiang 2023-06-26 20:15:49 +08:00
parent a16cd1ae80
commit 1777da678f
3 changed files with 12 additions and 7 deletions

File 1 of 3:

@@ -1 +1 @@
-Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
+Apache Spark - A unified analytics engine for large-scale data processing

File 2 of 3:

@@ -5,19 +5,20 @@ Apache Spark™ is a multi-language engine for executing data engineering, data
 %%LOGO%%
 ## Online Documentation
 You can find the latest Spark documentation, including a programming guide, on the [project web page](https://spark.apache.org/documentation.html). This README file only contains basic setup instructions.
 ## Interactive Scala Shell
 The easiest way to start using Spark is through the Scala shell:
-```
+```console
 docker run -it spark /opt/spark/bin/spark-shell
 ```
 Try the following command, which should return 1,000,000,000:
-```
+```scala
 scala> spark.range(1000 * 1000 * 1000).count()
 ```
@@ -25,13 +26,13 @@ scala> spark.range(1000 * 1000 * 1000).count()
 The easiest way to start using PySpark is through the Python shell:
-```
+```console
 docker run -it spark:python3 /opt/spark/bin/pyspark
 ```
 And run the following command, which should also return 1,000,000,000:
-```
+```python
 >>> spark.range(1000 * 1000 * 1000).count()
 ```
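The shell examples above count the integers produced by `spark.range(1000 * 1000 * 1000)`. As a sanity check on the advertised result, here is a plain-Python sketch of the same arithmetic (no Spark or Docker required; `range`/`len` stand in for `spark.range(...).count()`):

```python
# spark.range(n) produces the integers [0, n); .count() returns how many
# there are. Python's range is lazy, so len() computes this in O(1).
n = 1000 * 1000 * 1000
count = len(range(n))
print(count)  # 1000000000
```

The Spark job distributes the same count across executors, but the expected value is identical.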
@@ -39,7 +40,7 @@ And run the following command, which should also return 1,000,000,000:
 The easiest way to start using R on Spark is through the R shell:
-```
+```console
 docker run -it apache/spark-r /opt/spark/bin/sparkR
 ```
@@ -47,3 +48,7 @@ docker run -it apache/spark-r /opt/spark/bin/sparkR
 https://spark.apache.org/docs/latest/running-on-kubernetes.html
+## Configuration and environment variables
+See more in https://github.com/apache/spark-docker/blob/master/OVERVIEW.md#environment-variable

File 3 of 3:

@@ -1 +1 @@
-https://issues.apache.org/jira/browse/SPARK
+https://issues.apache.org/jira/browse/SPARK