Questions tagged [amazon-emr]
4 questions
4
votes
0 answers
Reading data from Amazon Redshift in Spark 2.4
We used to read data in Spark 2.3 using Databricks with the following code segment.
Spark-shell initialization:
spark-shell --jars RedshiftJDBC42-1.2.10.1009.jar --packages…
1
vote
1 answer
Cannot connect to Amazon EMR cluster with PuTTY
I created an EMR cluster with the standard configuration.
Then I allowed inbound SSH traffic on port 22 for the corresponding security group. I added the following rules:
Then I followed the instructions:
But I am getting the error:
Server refused our…
Andrey
- 111
1
vote
0 answers
How to read large zip files in PySpark
I have a number of .zip files on S3 that I want to process and extract some data from. Each zip file contains a single JSON file. In Spark we can read .gz files, but I didn't find any way to read data within .zip files. Can someone please…
Sandie
- 111
1
vote
1 answer
How to add an EBS volume by snapshot ID to Amazon EMR
We have a large amount of data on an EBS volume. I am familiar with attaching the volume to a new EC2 cluster.
But how is this done for EMR? Here is the Add Storage dialog; notice there are no entries for specifying the EBS snapshot ID:
WestCoastProjects
- 4,032