Below is the code snippet I use to get the file extension:
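    import java.io.ByteArrayInputStream;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.tika.config.TikaConfig;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;
    import org.apache.tika.mime.MimeType;

    public static Map<String, String> getImageType(byte[] imageContent) throws Exception {
        Map<String, String> result = new HashMap<String, String>();
        TikaConfig config = TikaConfig.getDefaultConfig();
        // Detect the media type from the content only (no file name hint is passed).
        MediaType mediaType = config.getMimeRepository().detect(new ByteArrayInputStream(imageContent), new Metadata());
        // Look up the registered MIME type to get its preferred extension.
        MimeType mimeType = config.getMimeRepository().forName(mediaType.toString());
        // replaceFirst takes a regex, so the dot is escaped.
        result.put("extn", mimeType.getExtension().replaceFirst("\\.", ""));
        return result;
    }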
I am using the tika-core dependency, version 1.13.
The code above works for files with the extensions .pdf, .jpg, .csv, .txt, .bin and .jpeg and returns the expected extension, e.g. {extn=.pdf}. However, it does not work for .doc and .docx files: for those it returns a blank value, {extn=}.
I expect to get the .doc and .docx extensions for these files, e.g. {extn=.doc}.
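To help narrow this down, here is a minimal sketch that uses only the JDK (no Tika) to check the leading bytes of the content. It rests on an assumption that I have not verified against Tika 1.13: that the difference is related to .doc and .docx being container formats. A .doc file is an OLE2 compound file and a .docx file is a ZIP archive, so their first bytes alone do not identify them as Word documents. The class and method names (ContainerSniffer, sniffContainer, startsWith) are hypothetical and exist only for this illustration.

```java
class ContainerSniffer {
    // OLE2 compound file signature (legacy .doc, .xls, .ppt all start with it).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature (.docx, .xlsx, .jar, plain .zip, ...).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Returns true if content begins with the given signature bytes.
    static boolean startsWith(byte[] content, byte[] magic) {
        if (content == null || content.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (content[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Classifies the content by its container signature only.
    // "ole2" and "zip" say nothing about which document type is inside.
    static String sniffContainer(byte[] content) {
        if (startsWith(content, OLE2_MAGIC)) {
            return "ole2";
        }
        if (startsWith(content, ZIP_MAGIC)) {
            return "zip";
        }
        return "other";
    }
}
```

Calling ContainerSniffer.sniffContainer(imageContent) on the failing files would show whether they are seen as generic containers, which might explain why no specific extension is found for them.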