
GlusterVolume modification to wrap getPos() functionality in underlying ... #59

Open

jayunit100 wants to merge 1 commit into master

Conversation

jayunit100
Contributor

This pull request wraps calls to getPos() and records the last value, returning that last value in the case that the underlying stream closes. This fix is essential for distributed MapReduce jobs that use the mapred.* API, because that API expects a working call to FSInputStream.getPos() even after the stream has been read. And because RawLocalFileSystem buffers that stream (unlike DFSInputStream, which doesn't wrap the stream), the buffer closes and sets the underlying stream to null. See BZ 100362 for details.
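A minimal sketch of the caching approach described above, assuming illustrative names; this is not the actual GlusterVolume patch:

import java.io.IOException;
import org.apache.hadoop.fs.FSInputStream;

// Wraps a real stream, records the last known position, and falls back
// to it once the underlying stream has been closed or nulled.
public class LastPosCachingInputStream extends FSInputStream {

    private final FSInputStream wrapped;
    private long lastPos = 0;

    public LastPosCachingInputStream(FSInputStream wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public synchronized long getPos() throws IOException {
        try {
            lastPos = wrapped.getPos(); // record the latest known position
        } catch (IOException e) {
            // Stream reported itself closed: fall back to the cached value.
        } catch (NullPointerException e) {
            // Inner reference nulled by a wrapping buffer: same fallback.
        }
        return lastPos;
    }

    @Override
    public int read() throws IOException {
        return wrapped.read();
    }

    @Override
    public void seek(long pos) throws IOException {
        wrapped.seek(pos);
        lastPos = pos;
    }

    @Override
    public boolean seekToNewSource(long targetPos) throws IOException {
        return wrapped.seekToNewSource(targetPos);
    }

    @Override
    public void close() throws IOException {
        wrapped.close(); // lastPos deliberately survives the close
    }
}

With this wrapper, a caller that queries getPos() after close() gets the last position that was observed, mirroring the DFSClient-style behaviour the mapred.* API expects.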

@childsb
Contributor

childsb commented Sep 18, 2013

A typical file I/O workflow goes:
openFile();
fileIOOperations()
closeFile()

any type of I/O after closeFile() is invalid. The GlusterVolume class is correctly throwing an exception if an attempt is made to access the file's stream position after it's been closed. There is no stream open, thus there is no valid position after the stream is closed, and calling getPos() on a closed stream means the open/read/close flow is broken further up the calling chain.
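Concretely, the disputed call order looks like this (hypothetical client code against the Hadoop FileSystem API, not taken from the PR):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class BrokenFlowExample {
    static void brokenFlow(FileSystem fs, Path p) throws IOException {
        FSDataInputStream in = fs.open(p);
        long before = in.getPos();  // valid: the stream is open
        in.read(new byte[128]);
        in.close();
        long after = in.getPos();   // broken flow: getPos() after close()
    }
}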

I don't like nerfing up the code by catching exceptions and not reporting them. The responsibility is on the caller NOT to attempt file I/O operations after it has closed the file or stream. If the caller attempts to do so, we're justified in throwing an exception.

I would like to see the open(), read(), close(), getPos() code that's exhibiting this incorrect I/O workflow before OKing guarding against the behavior at the file system level.

@jayunit100
Contributor Author

To clarify Brad's comment: it is true that the bug can be thought of as existing in the RecordReader implementation in the mapred version of MultiFileWordCount, and not necessarily as a bug in our shim. This patch allows you to call getPos() so that the behaviour is similar to the DFSClient stream: it always returns a valid long. Without the patch, the version of MultiFileWordCount in hadoop-examples will not run. I'm in agreement that the overlying "bug" is in the older MultiFileWordCount impl, but nevertheless, the older mapred.* MultiFileWordCount impl does work on HDFS due to the way the DFSClient is implemented.

@jayunit100
Contributor Author

The offending code in the RecordReader for MultiFileWordCount violates the access pattern; here is why (I agree it's not the right way to use a stream). But in any case, here is the code path Brad requested to see:

By the way, some of this code is JDK dependent as well; see how streams get set to null when closed: http://grepcode.com/file_/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/io/BufferedReader.java/?v=source

  1. The RecordReader calls getPos() BEFORE and AFTER reading a record, in the TrackingRecordReader implementation, which is in mapred.

  2. At the first call, before the underlying stream is closed, getPos() returns something valid. Then the stream is closed, and the implementation in the MultiFileWordCount custom RecordReader returns the result of reading getPos() from the "currentStream".

  3. But, since currentStream is actually wrapped by a buffered stream, which sets currentStream to null when it's closed, the second call to getPos() fails with an NPE.

So the code path is:

#In the MapTask in the mapred.* framework, hadoop 1.X
TrackingRecordReader calls MultiFileWordCountRecordReader.getPos()

#In the MultiFileWordCount MapReduce job - in hadoop-examples for 1.X
MultiFileWordCountRecordReader.getPos() calls currentStream.getPos():

public long getPos() throws IOException {
    long currentOffset = currentStream == null ? 0 : currentStream.getPos();
    return offset + currentOffset;
}

and currentStream has already been set to null by the overlying BufferedReader, since currentReader is closed and currentReader is an instance of BufferedReader:

#Why is an NPE thrown? Because of the way BufferedReader works... Finally, from src/examples/org/apache/hadoop/examples/MultiFileWordCount.java:
currentReader=new BufferedReader(new InputStreamReader(currentStream));
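A self-contained JDK demo of that close behavior (nothing Hadoop-specific; it only shows that closing the outer BufferedReader closes and detaches the whole chain while the currentStream reference itself stays non-null):

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class ClosedReaderDemo {
    public static void main(String[] args) throws IOException {
        InputStream currentStream = new ByteArrayInputStream("data".getBytes());
        BufferedReader currentReader =
                new BufferedReader(new InputStreamReader(currentStream));

        System.out.println(currentReader.readLine()); // prints "data"
        currentReader.close(); // BufferedReader nulls its internal reader

        // currentStream (the reference) is still non-null here, but the
        // whole chain has been closed; further reads through currentReader
        // throw IOException("Stream closed") rather than returning data.
    }
}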

@childsb
Contributor

childsb commented Sep 18, 2013

Will you walk through the error/code in TrackingRecordReader by line number?

@jayunit100
Contributor Author

From ./src/examples/org/apache/hadoop/examples/MultiFileWordCount.java in hadoop 1.2:

151 public long getPos() throws IOException {
152 long currentOffset = currentStream == null ? 0 : currentStream.getPos();
153 return offset + currentOffset;
154 }

The error is thrown at getPos(), because currentStream is not null, but the InputStreamReader is null. Makes sense?
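In other words, the null check on line 152 only guards currentStream itself; it cannot see a nulled field inside the closed chain. A hypothetical wrapper sketch of that failure mode (illustrative names, not Hadoop source):

import java.io.IOException;

// Hypothetical stand-in for a buffering stream that nulls its inner
// reference on close; illustrates why a "currentStream == null" guard
// does not prevent the NPE described above.
class BufferingPosStream {
    interface PosStream {
        long getPos() throws IOException;
        void close() throws IOException;
    }

    private PosStream in; // set to null by close()

    BufferingPosStream(PosStream in) { this.in = in; }

    long getPos() throws IOException {
        return in.getPos(); // NPE once in == null
    }

    void close() throws IOException {
        try { in.close(); } finally { in = null; }
    }
}

Here the caller's reference to the BufferingPosStream is still non-null after close(), so the line-152 guard passes, and the dereference of the nulled inner field is what actually throws.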

@childsb
Contributor

childsb commented Oct 7, 2013

Per the code you pasted in, if the stream is null, "0" is assumed; if it's not, then the position is queried.

This shows that MultiFileWordCount is already guarding somewhat so that a position is returned even if a stream doesn't exist, so I don't see why the GlusterVolume code should be doing the same thing.

@jayunit100
Contributor Author

Alas, you're right! The problem is that currentStream is not null.
However, the InputStreamReader IS set to null.
That is the problem :) !

@jayunit100
Contributor Author

Does it make sense now? The currentStream still exists, but the overlying wrapper (the buffer) SETs the input stream reader to null.

Thus, when the input stream reader is accessed, an NPE is thrown. But the logic in this method only checks the current stream.

This is NOT a problem in HDFS, because in HDFS the underlying stream doesn't proxy calls to getPos().
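For contrast, a sketch of the field-backed design that keeps getPos() safe in HDFS-style streams (an assumption about the general design, not a quote of the DFSClient source):

// Position lives in a plain field, so getPos() never dereferences a
// wrapped stream and stays valid even after close().
class FieldBackedPosStream {
    private long pos = 0;

    void advance(int n) { pos += n; }  // stand-in for read() bookkeeping
    long getPos() { return pos; }      // plain field: safe after close()
    void close() { /* pos is left intact; getPos() keeps working */ }
}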

@jayunit100
Contributor Author

https://issues.apache.org/jira/browse/MAPREDUCE-5572 <--- corresponding JIRA pointing at the root cause, which is a custom RecordReader that relies on undefined FileSystem semantics

@childsb
Contributor

childsb commented Oct 9, 2013
