added history server

481941c0 · Stefan Kesselheim · 61b6f9b9 · 481941c0
Commit 481941c0 authored 2 years ago by Stefan Kesselheim
--- a/README.md
+++ b/README.md
@@ -25,6 +25,11 @@ python pyspark_pi.py
 ```
 Note the `i` that has been added to the master hostname.
+To connect to the master and workers with a browser, you need a command of the following form:
+```bash
+ssh -L 18080:localhost:18080 -L 8080:localhost:8080 kesselheim1@jwb0085i.juwels -J kesselheim1@juwels-booster.fz-juelich.de
+```
+Then you can navigate to http://localhost:8080 to see the output.
 Open Questions
 - The Scala example uses all worker instances as expected, but the Python example uses only 2. Why?
@@ -32,6 +37,11 @@ Open Questions
 ToDos:
 - Include a Python Virtual Environment
 - Create a notebook that illustrates how to run the Pi example in Jupyter
+- The history server does not work yet. It crashed with this error message:
+```
+Exception in thread "main" java.io.FileNotFoundException: Log directory specified does not exist: file:/tmp/spark-events Did you configure the correct one through spark.history.fs.logDirectory? 
+```
+The log directory setting (`spark.history.fs.logDirectory`) is not configured correctly yet.
 ## References
 - Pi Estimate (Python + Scala): [Apache Spark examples](https://spark.apache.org/examples.html)
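
Regarding the history-server ToDo in this diff: the error says that `file:/tmp/spark-events`, the default value of `spark.history.fs.logDirectory`, does not exist. A minimal sketch of one possible fix follows. It assumes that the event-log directory only needs to be created and that the jobs and the history server should both point at it. The variables `EVENT_DIR` and `CONF_FILE` and their default paths are hypothetical and not part of this repository. On a multi-node setup the directory probably has to be on a filesystem that both the drivers and the history server can see, so `/tmp` may not be the right choice.

```bash
# Hypothetical locations; override them via the environment as needed.
EVENT_DIR="${EVENT_DIR:-/tmp/spark-events}"
CONF_FILE="${CONF_FILE:-./spark-defaults.conf}"

# The history server refuses to start if its log directory is missing.
mkdir -p "$EVENT_DIR"

# Make jobs write event logs and make the history server read the same place.
# Note: this appends, so running it twice duplicates the entries.
cat >> "$CONF_FILE" <<EOF
spark.eventLog.enabled           true
spark.eventLog.dir               file:$EVENT_DIR
spark.history.fs.logDirectory    file:$EVENT_DIR
EOF
```

The generated entries would then need to be merged into `$SPARK_HOME/conf/spark-defaults.conf` before running `$SPARK_HOME/sbin/start-history-server.sh`. With the default history-server port, the UI should then be reachable through the forwarded port 18080 from the `ssh` command above.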