Technologies: javascript, typescript
Tolerim
23 days ago

The byte array in JavaScript is always different from the byte array in Java.

I'm converting an image file into a Blob object in JavaScript, and then reading the Blob's bytes into a byte array with the following code:

public async blobToSignedEightByteArray(blob: Blob): Promise<Int8Array> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();

    reader.onload = () => {
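      // readAsArrayBuffer always yields an ArrayBuffer, so this cast replaces @ts-ignore.
      const arrayBuffer = reader.result as ArrayBuffer;
      // Int8Array exposes each byte as a signed value (-128..127), like Java's byte.
      const byteArray = new Int8Array(arrayBuffer);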

      resolve(byteArray);
    };

    reader.onerror = reject;

    reader.readAsArrayBuffer(blob);
  });
}
After retrieving the byte array, I decode it as UTF-8 into a string using the following code:

let byteData = await this.blobToSignedEightByteArray(blob);
return new TextDecoder('utf-8').decode(byteData); //some result
However, when I pass the same image file to a Java method, getFileBytes, that accepts a MultipartFile object, and build a string from the bytes with new String(byteData), the result is completely different from the one produced by the JavaScript code. What could be causing this difference?

Answers(2)
Vogi
22 days ago
Verified Answer
TextDecoder in JavaScript and new String(byte[]) in Java both turn a byte sequence into a string, but they do not necessarily use the same character encoding. new TextDecoder('utf-8') always decodes the bytes as UTF-8. new String(byteData) in Java decodes the bytes using the platform's default charset, which is not guaranteed to be UTF-8, so the same bytes can produce a different string. If you want Java to decode as UTF-8 regardless of the platform, use new String(byteData, StandardCharsets.UTF_8) instead. I hope this helps!
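
To illustrate why the two results can differ, here is a minimal TypeScript sketch (not the asker's code). It decodes the first four bytes of a PNG file once as UTF-8 and once as ISO-8859-1. ISO-8859-1 is only an assumed stand-in for a non-UTF-8 platform default, and the names pngStart, asUtf8 and asLatin1 are made up for this example. Because image data is binary rather than text, the UTF-8 decoder substitutes U+FFFD for the byte it cannot interpret, while ISO-8859-1 maps every byte to some character.

```typescript
// First four bytes of a PNG file. 0x89 cannot start a UTF-8 sequence.
const pngStart = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);

// UTF-8 decoding (non-fatal mode) replaces the invalid byte with U+FFFD.
const asUtf8: string = new TextDecoder("utf-8").decode(pngStart);

// ISO-8859-1 maps each byte to the code point with the same value.
// Assumption: this stands in for a non-UTF-8 default charset on the Java side.
const asLatin1: string = Array.from(pngStart, (b) => String.fromCharCode(b)).join("");

console.log(asUtf8 === "\uFFFDPNG");   // true
console.log(asLatin1 === "\u0089PNG"); // true
console.log(asUtf8 === asLatin1);      // false: same bytes, different strings
```

The same four bytes give two different strings purely because of the charset used to decode them.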

Tolerim
23 days ago
The issue is most likely the encoding used by Java's new String(byteData) constructor. By default, this constructor decodes the byte array using the system's default character set, which is not necessarily UTF-8. To keep the JavaScript and Java results consistent, specify the encoding explicitly on both sides. In the Java code, you can use the String constructor that takes an explicit charset argument:
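import java.io.IOException;
import java.nio.charset.StandardCharsets;

private String getFileData(MultipartFile file) {
    try {
        byte[] byteArray = file.getBytes();
        // Decode as UTF-8 explicitly instead of relying on the platform default charset.
        return new String(byteArray, StandardCharsets.UTF_8);
    } catch (IOException e) {
        // Reading the upload failed: log the message and signal failure with null.
        log.error(e.getMessage());
        return null;
    }
}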
This code decodes the byte array as UTF-8 regardless of the platform default. Similarly, in the JavaScript code, you can pass the encoding to the TextDecoder constructor (UTF-8 is also its default):
let byteData = await this.blobToSignedEightByteArray(blob);
return new TextDecoder('utf-8').decode(byteData);
With UTF-8 specified explicitly on both sides, the same bytes should decode to the same string.
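
If the strings still differ, it can help to compare the raw bytes directly, with no charset involved. The sketch below is one way to do that in TypeScript. toHex is a hypothetical helper, not part of any library, and the sample values are the first four bytes of a PNG file as a signed array would expose them. An Int8Array and a Uint8Array over the same buffer hold the same bytes and differ only in signed versus unsigned interpretation, so the helper reads through an unsigned view to format each byte as two hex digits.

```typescript
// Hypothetical helper: formats bytes as lowercase hex, two digits per byte.
function toHex(bytes: Int8Array | Uint8Array): string {
  // View the same memory as unsigned bytes so -119 prints as "89", not "-77".
  const unsigned = new Uint8Array(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return Array.from(unsigned, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Signed view of the first four PNG bytes (0x89 0x50 0x4E 0x47).
const signedBytes = new Int8Array([-119, 80, 78, 71]);
const hex = toHex(signedBytes);

console.log(hex); // 89504e47
```

Producing the same kind of hex string from the byte[] on the Java side and comparing the two shows whether the bytes themselves differ or only the decoding step does.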