How do I change the dataset when fine-tuning the Whisper model?

I tried to fine-tune the Whisper model by following the article. If you want to see the code, please look at the Colab link.

All I want to do is replace the Common Voice dataset used in the article with my own dataset.

When I use the prepared Common Voice dataset, it works very well. The Common Voice dataset appears to use a pre-cached .arrow file.

Because of this, it is fast, and the whole process is handled well. With my own dataset, however, it does not work.

Specifically, the call below takes a very long time and does not work.

common_voice = common_voice.map(prepare_dataset, remove_columns=common_voice.column_names["train"], num_proc=2)

In my opinion, this is because the raw Common Voice data was pre-cached. I load my dataset with the simple code below.

My code does not create a cache file of the decoded audio arrays.

import os
import re

import numpy as np
import soundfile as sf
from datasets import Dataset

# getDirList, getFileList and getJson are my own helper functions (not shown here).

class DataLoader_AIHub:
    def __init__(self, rootPath):
        self.rootPath = rootPath

    def getData(self, max_files_to_load, startPoint=0):
        rootPath_audio = os.path.join(self.rootPath, 'audio')
        audioDirPaths = getDirList(rootPath_audio)
        total_files_loaded = 0
        data_list = []
        for audioDir in audioDirPaths:
            audioFileNames = getFileList(audioDir)
            audioFilePaths = [audioDir + '/' + str(item) for item in audioFileNames]
            # Each label sits in a parallel label/ tree with a .json extension.
            labelFilePaths = [item.replace('/audio/', '/label/').replace('.wav', '.json') for item in audioFilePaths]
            for audioPath, labelPath in zip(audioFilePaths, labelFilePaths):
                jsonInfo = getJson(labelPath)
                # Skip transcripts that contain parenthesized annotations.
                if '(' in jsonInfo['발화정보']['stt']:
                    continue
                # Skip files until startPoint is reached.
                if startPoint > total_files_loaded:
                    total_files_loaded += 1
                    continue
                # Decode the whole file into memory as float32.
                audio, sr = sf.read(audioPath)
                audioArray = audio.astype(np.float32)
                record = {
                    'audio': {'path': audioPath, 'array': audioArray, 'sampling_rate': sr},
                    'sentence': re.sub('\r\n', '', jsonInfo['발화정보']['stt']),
                    'age': jsonInfo['녹음자정보']['age'],
                    'gender': jsonInfo['녹음자정보']['gender'],
                }
                data_list.append(record)
                total_files_loaded += 1
                if total_files_loaded >= max_files_to_load + startPoint:
                    return Dataset.from_list(data_list)
        return Dataset.from_list(data_list)

(It is a Korean dataset.)

The audio files (.wav) are sampled at 16 kHz, and audioArray is the decoded sample array. I presume the .arrow file stores these decoded arrays.
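For comparison, here is a minimal sketch of an alternative loading path that stores only file paths and leaves decoding to the datasets library, instead of decoding every file up front and passing the arrays to Dataset.from_list. The names iter_records and build_dataset are hypothetical, and the sketch assumes that the label/ tree mirrors the audio/ tree exactly as in the loader above. As far as I understand, Dataset.from_generator writes its rows to an Arrow cache on disk and the Audio feature decodes each file only when it is accessed, but I have not verified that this fixes the slow map call.

```python
import json
import os


def iter_records(root_path, max_files=None):
    """Yield one metadata record per utterance without decoding any audio."""
    audio_root = os.path.join(root_path, "audio")
    yielded = 0
    # Sort directories and files so the order is reproducible between runs.
    for dir_path, _dir_names, file_names in sorted(os.walk(audio_root)):
        for file_name in sorted(file_names):
            if not file_name.endswith(".wav"):
                continue
            audio_path = os.path.join(dir_path, file_name)
            # Assumption: labels mirror the audio tree (audio/ -> label/, .wav -> .json).
            rel = os.path.relpath(audio_path, audio_root)
            label_path = os.path.join(
                root_path, "label", os.path.splitext(rel)[0] + ".json"
            )
            with open(label_path, encoding="utf-8") as f:
                info = json.load(f)
            sentence = info["발화정보"]["stt"]
            # Same filter as the original loader: skip annotated transcripts.
            if "(" in sentence:
                continue
            yield {
                "audio": audio_path,  # a path only; no samples are read here
                "sentence": sentence.replace("\r\n", ""),
                "age": info["녹음자정보"]["age"],
                "gender": info["녹음자정보"]["gender"],
            }
            yielded += 1
            if max_files is not None and yielded >= max_files:
                return


def build_dataset(root_path, max_files=None):
    """Build a Dataset whose 'audio' column is decoded lazily at 16 kHz."""
    # Imported here so iter_records stays usable without the library installed.
    from datasets import Audio, Dataset

    ds = Dataset.from_generator(
        iter_records,
        gen_kwargs={"root_path": root_path, "max_files": max_files},
    )
    # Turn the path strings into an Audio feature; decoding happens on access.
    return ds.cast_column("audio", Audio(sampling_rate=16000))
```

Note also that the map call from the article uses common_voice.column_names["train"], which expects a DatasetDict with a "train" split. A single Dataset like the one returned here exposes column_names as a plain list, so it would either need to be wrapped in a DatasetDict or passed as remove_columns=ds.column_names.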

Am I doing something wrong?
