Note: Everything in this post applies only to the Samsung Galaxy S7. I do not know how emulators or other devices behave.
In onImageAvailable I continuously convert each image to an NV21 byte array and forward it to an API that expects raw NV21 data.
This is how I initialize the image reader and receive the images:
private void openCamera() {
    ...
    mImageReader = ImageReader.newInstance(WIDTH, HEIGHT,
            ImageFormat.YUV_420_888, 1); // only 1 for best performance
    mImageReader.setOnImageAvailableListener(
            mOnImageAvailableListener, mBackgroundHandler);
    ...
}
private final ImageReader.OnImageAvailableListener mOnImageAvailableListener
        = new ImageReader.OnImageAvailableListener() {
    @Override
    public void onImageAvailable(ImageReader reader) {
        Image image = reader.acquireLatestImage();
        if (image != null) {
            byte[] data = convertYUV420ToNV21_ALL_PLANES(image); // this image is turned 90 deg using front cam in portrait mode
            byte[] data_rotated = rotateNV21_working(data, WIDTH, HEIGHT, 270);
            ForwardToAPI(data_rotated); // image data is being forwarded to api and received later on
            image.close();
        }
    }
};
The function that converts the image to raw NV21 (from here) works fine, but the image comes out rotated by 90 degrees (due to Android?) when using the front camera in portrait mode. I modified it slightly according to Alex Cohn's comments:
private byte[] convertYUV420ToNV21_ALL_PLANES(Image imgYUV420) {
    byte[] rez;
    ByteBuffer buffer0 = imgYUV420.getPlanes()[0].getBuffer();
    ByteBuffer buffer1 = imgYUV420.getPlanes()[1].getBuffer();
    ByteBuffer buffer2 = imgYUV420.getPlanes()[2].getBuffer();
    // A generic conversion would have to pick every second byte here.
    // Instead I simply take the first byte of buffer 2 and the entire buffer 1.
    int buffer0_size = buffer0.remaining();
    int buffer1_size = buffer1.remaining(); // / 2 + 1;
    int buffer2_size = 1; // buffer2.remaining(); // / 2 + 1;
    byte[] buffer0_byte = new byte[buffer0_size];
    byte[] buffer1_byte = new byte[buffer1_size];
    byte[] buffer2_byte = new byte[buffer2_size];
    buffer0.get(buffer0_byte, 0, buffer0_size);
    buffer1.get(buffer1_byte, 0, buffer1_size);
    // get() reads from the buffer's current position (its start), into offset 0
    buffer2.get(buffer2_byte, 0, buffer2_size);
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    try {
        // swap 1 and 2 as blue and red colors are swapped
        outputStream.write(buffer0_byte);
        outputStream.write(buffer2_byte);
        outputStream.write(buffer1_byte);
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    rez = outputStream.toByteArray();
    return rez;
}
Hence "data" needs to be rotated. With the following function (from here), I get a weird picture that looks as if it were interlaced three times:
public static byte[] rotateNV21(byte[] input, int width, int height, int rotation) {
    byte[] output = new byte[input.length];
    boolean swap = (rotation == 90 || rotation == 270);
    // **EDIT:** in portrait mode & front cam this needs to be set to true:
    boolean yflip = true;// (rotation == 90 || rotation == 180);
    boolean xflip = (rotation == 270 || rotation == 180);
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            int xo = x, yo = y;
            int w = width, h = height;
            int xi = xo, yi = yo;
            if (swap) {
                xi = w * yo / h;
                yi = h * xo / w;
            }
            if (yflip) {
                yi = h - yi - 1;
            }
            if (xflip) {
                xi = w - xi - 1;
            }
            output[w * yo + xo] = input[w * yi + xi];
            int fs = w * h;
            int qs = (fs >> 2);
            xi = (xi >> 1);
            yi = (yi >> 1);
            xo = (xo >> 1);
            yo = (yo >> 1);
            w = (w >> 1);
            h = (h >> 1);
            // adjust for interleave here
            int ui = fs + (w * yi + xi) * 2;
            int uo = fs + (w * yo + xo) * 2;
            // and here
            int vi = ui + 1;
            int vo = uo + 1;
            output[uo] = input[ui];
            output[vo] = input[vi];
        }
    }
    return output;
}
Resulting in this picture:
Note: it is still the same cup; however, you see it 3-4 times.
Using another suggested rotate function from here gives the proper result:
public static byte[] rotateNV21_working(final byte[] yuv,
                                final int width,
                                final int height,
                                final int rotation)
{
  if (rotation == 0) return yuv;
  if (rotation % 90 != 0 || rotation < 0 || rotation > 270) {
    throw new IllegalArgumentException("0 <= rotation < 360, rotation % 90 == 0");
  }
  final byte[]  output    = new byte[yuv.length];
  final int     frameSize = width * height;
  final boolean swap      = rotation % 180 != 0;
  final boolean xflip     = rotation % 270 != 0;
  final boolean yflip     = rotation >= 180;
  for (int j = 0; j < height; j++) {
    for (int i = 0; i < width; i++) {
      final int yIn = j * width + i;
      final int uIn = frameSize + (j >> 1) * width + (i & ~1);
      final int vIn = uIn       + 1;
      final int wOut     = swap  ? height              : width;
      final int hOut     = swap  ? width               : height;
      final int iSwapped = swap  ? j                   : i;
      final int jSwapped = swap  ? i                   : j;
      final int iOut     = xflip ? wOut - iSwapped - 1 : iSwapped;
      final int jOut     = yflip ? hOut - jSwapped - 1 : jSwapped;
      final int yOut = jOut * wOut + iOut;
      final int uOut = frameSize + (jOut >> 1) * wOut + (iOut & ~1);
      final int vOut = uOut + 1;
      output[yOut] = (byte)(0xff & yuv[yIn]);
      output[uOut] = (byte)(0xff & yuv[uIn]);
      output[vOut] = (byte)(0xff & yuv[vIn]);
    }
  }
  return output;
}
The result is fine now:
The top image shows the direct stream using a texture view's surface and adding it to the captureRequestBuilder. The bottom image shows the raw image data after rotating.
The questions are:
- Does this hack in "convertYUV420ToNV21_ALL_PLANES" work on any device/emulator?
- Why does rotateNV21 not work, while rotateNV21_working works fine?
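Regarding the second question, one difference can be read directly from the two listings: rotateNV21 writes to output[w * yo + xo] with w being the input width, whereas rotateNV21_working addresses output rows with wOut, which equals height whenever the rotation is 90 or 270. Whether that alone explains the repeated cups has not been verified here. The sketch below shows the swapped row width in isolation. It handles luma only, rotates by 90 degrees clockwise, and assumes tightly packed rows; the class and method names are made up for illustration.

```java
import java.util.Arrays;

final class RotateLumaSketch {
    /** Rotates a tightly packed luma plane by 90 degrees clockwise. */
    static byte[] rotateLuma90(byte[] in, int width, int height) {
        byte[] out = new byte[in.length];
        // After a 90 degree rotation each output row is `height` pixels long.
        int outWidth = height;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int outX = height - 1 - y; // input rows become output columns, right to left
                int outY = x;              // input columns become output rows
                out[outY * outWidth + outX] = in[y * width + x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 3x2 input:  1 2 3
        //             4 5 6
        byte[] rotated = rotateLuma90(new byte[] {1, 2, 3, 4, 5, 6}, 3, 2);
        System.out.println(Arrays.toString(rotated)); // [4, 1, 5, 2, 6, 3]
    }
}
```

If the output index were computed with the input width (3) instead of outWidth (2), the same bytes would land in different rows, and a consumer reading the buffer as a 2x3 frame would see sheared, repeated content.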
 
Edit: The mirror issue is fixed (see the code comment). The squeeze issue is fixed as well; it was caused by the API the data is forwarded to. The remaining open issue is a proper, not too expensive function that converts and rotates an image into raw NV21 and works on any device.
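For the conversion half of that open issue, a device-independent version would presumably have to honor the row stride and pixel stride of each plane instead of relying on how buffer 1 and buffer 2 happen to overlap on the S7. The following is only a sketch under stated assumptions: it takes plain ByteBuffers and stride values (on Android these would come from Image.Plane.getBuffer(), getRowStride() and getPixelStride()), it assumes even width and height, and it is not tuned for speed. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class Nv21Sketch {
    /** Packs three YUV 4:2:0 planes into one NV21 array (Y plane, then interleaved V/U). */
    static byte[] toNv21(ByteBuffer yBuf, int yRowStride,
                         ByteBuffer uBuf, ByteBuffer vBuf,
                         int uvRowStride, int uvPixelStride,
                         int width, int height) {
        byte[] out = new byte[width * height * 3 / 2];
        int pos = 0;
        // Luma: copy row by row so that any padding at the end of a row is skipped.
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                out[pos++] = yBuf.get(row * yRowStride + col);
            }
        }
        // Chroma: one V and one U sample per 2x2 pixel block; NV21 stores V first.
        for (int row = 0; row < height / 2; row++) {
            for (int col = 0; col < width / 2; col++) {
                int idx = row * uvRowStride + col * uvPixelStride;
                out[pos++] = vBuf.get(idx);
                out[pos++] = uBuf.get(idx);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 4x2 frame, luma rows padded to 6 bytes, planar chroma (pixel stride 1).
        ByteBuffer y = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 0, 0, 5, 6, 7, 8, 0, 0});
        ByteBuffer u = ByteBuffer.wrap(new byte[] {10, 11});
        ByteBuffer v = ByteBuffer.wrap(new byte[] {20, 21});
        System.out.println(Arrays.toString(toNv21(y, 6, u, v, 2, 1, 4, 2)));
    }
}
```

When the chroma pixel stride is 2 and the row strides equal the image width, the per-byte loops could be replaced by bulk copies, which is roughly what the hack above does implicitly; the loops are the fallback for every other layout.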

