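Building a WAV Audio Analyzer with Recording Support in PyQt5

Using PyQt5, which lets you build GUI apps in Python, I take another shot at the audio analysis software that the WATLAB blog has attempted many times. This time, with some help from ChatGPT, I turned it into a full-fledged GUI app with drag operations, double-click operations, and multiple windows.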

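Hello, this is wat (@watlablog). As an example of building a GUI app with PyQt5, this post introduces a WAV file audio analyzer with a recording feature!

This project was carried out at #大晦日ハッカソン (the New Year's Eve Hackathon, hosted on connpass).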

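Overview of This Article

Motivation

This article builds a WAV file audio analyzer with a recording feature, but the WATLAB blog already has many similar posts on signal processing. None of them came close to my ideal audio processing software, so this is the Nth attempt.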

Why PyQt?

I have built GUI apps on the WATLAB blog with a variety of libraries. With wxPython, for example, I built a signal-processing app using wxFormBuilder, a helper tool that auto-generates GUI code.

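Reference) wxPython articles

・Auto-generating wxPython GUI code with wxFormBuilder
・Building the frame: Let's build a signal-processing GUI app with wxPython (1)

I have also built an FFT analyzer with kivy, with mobile deployment in mind. kivy seems to have become popular recently thanks to its modern UI components.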

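Reference) kivy articles

・Building a simple FFT analyzer with peak detection in kivy

Instead of wxPython and kivy, which I had already learned, or Tkinter from the Python standard library, this time I built the GUI app with a library called PyQt5, for the reasons below.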

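I wanted intuitive Matplotlib interaction

I chose PyQt5 because I wanted intuitive Matplotlib interaction. Leaning heavily on ChatGPT, I tried to add cursor and drag operations on Matplotlib plots in each of the libraries above. PyQt5 was the one where this came together easily, so I decided to write the app with it.

The Matplotlib cursor and drag operations I mean are the ones shown in the video below. This is what I wanted to do!

I really wanted to do this in kivy, but with my skills I could not get a satisfying result even with ChatGPT o1 (a stronger engineer could probably pull it off).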
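
As a rough illustration of the cursor drag described above, here is a minimal sketch built only on Matplotlib's `mpl_connect` event API. The class name `DraggableCursor` is hypothetical and does not appear in the app. The sketch uses the off-screen Agg canvas so it runs without a display, and it fakes mouse events with `SimpleNamespace`. In the actual app the same three events are wired to `on_mouse_press`, `on_mouse_drag`, and `on_mouse_release` on a `FigureCanvasQTAgg`.

```python
from types import SimpleNamespace

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


class DraggableCursor:
    """Vertical line that follows the mouse while the left button is held.

    Hypothetical helper for illustration; not part of the app's code.
    """

    def __init__(self, ax, x0=0.0):
        self.ax = ax
        self.line = ax.axvline(x0, color="red", linestyle="--")
        self.dragging = False
        canvas = ax.figure.canvas
        # The same three Matplotlib events the app registers
        canvas.mpl_connect("button_press_event", self.on_press)
        canvas.mpl_connect("motion_notify_event", self.on_motion)
        canvas.mpl_connect("button_release_event", self.on_release)

    def on_press(self, event):
        # Start dragging only on a left click inside this axes
        if event.inaxes is self.ax and event.button == 1:
            self.dragging = True

    def on_motion(self, event):
        # Move the line while dragging; xdata is None outside the axes
        if self.dragging and event.inaxes is self.ax and event.xdata is not None:
            self.line.set_xdata([event.xdata, event.xdata])
            self.ax.figure.canvas.draw_idle()

    def on_release(self, event):
        self.dragging = False

    @property
    def position(self):
        return float(self.line.get_xdata()[0])


# Off-screen figure; a Qt app would use FigureCanvasQTAgg instead
fig = Figure()
FigureCanvasAgg(fig)
ax = fig.add_subplot(111)
ax.set_xlim(0, 5)
cursor = DraggableCursor(ax)

# Simulate press -> move -> release with stand-in event objects
cursor.on_press(SimpleNamespace(inaxes=ax, button=1, xdata=1.0))
cursor.on_motion(SimpleNamespace(inaxes=ax, button=1, xdata=2.5))
cursor.on_release(SimpleNamespace(inaxes=ax, button=1, xdata=2.5))
print(cursor.position)
```

Keeping the drag state in a plain boolean flag, as the app does with `is_dragging_wave_start` and `is_dragging_wave_stop`, keeps the motion handler cheap because it returns immediately when no drag is in progress.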

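This article presents about 1,500 lines of code. Explaining all of it in detail would be a lot of work ~~for the author~~, so I will instead walk through selected parts of how to use the app.

Operating Environment

The code presented here was tested in the environment below. I have not verified it yet, but it most likely works only on macOS; I plan to post a Windows-compatible version later.

PC	M3 MacBook Air, RAM: 16 GB, macOS Sonoma 14.5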
Python	3.12.4
numpy	1.26.4
scipy	1.14.0
matplotlib	3.10.0
librosa	0.10.2.post1
torchaudio	2.5.1
sounddevice	0.5.1
PyQt5	5.15.11
pyaudio	0.2.14
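
To compare a local setup against the table above, a small check along these lines may help. It is only a sketch: the helper names `installed_version` and `compare_environment` are made up for this example, and it assumes the package names in the table match the distribution names that pip installed.

```python
from importlib import metadata

# Versions copied from the table above.
# Assumption: these names match the installed distribution names.
EXPECTED = {
    "numpy": "1.26.4",
    "scipy": "1.14.0",
    "matplotlib": "3.10.0",
    "librosa": "0.10.2.post1",
    "torchaudio": "2.5.1",
    "sounddevice": "0.5.1",
    "PyQt5": "5.15.11",
    "pyaudio": "0.2.14",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(expected):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in compare_environment(EXPECTED).items():
    status = "OK" if ok else "differs"
    shown = found if found is not None else "not installed"
    print(f"{name:12s} expected {wanted:14s} installed {shown:14s} {status}")
```

A version mismatch does not necessarily mean the app will fail; the table only records the versions the code was tested with.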

Overview of the GUI App with Recording

Full Code

The full code is shown first. It is long, so feel free to skip ahead to the next section if you do not plan to read it right away.

Full code of Wav Analyzer Tools

import sys
import queue
from functools import partial

import numpy as np
import librosa
import torchaudio
import sounddevice as sd
import pyaudio
import wave
from scipy import signal, fftpack

from PyQt5.QtWidgets import (
    QApplication,
    QMainWindow,
    QVBoxLayout,
    QHBoxLayout,
    QPushButton,
    QFileDialog,
    QWidget,
    QDoubleSpinBox,
    QLabel,
    QDialog,
    QFormLayout,
    QLineEdit,
    QComboBox,
    QMessageBox,
)
from PyQt5.QtCore import QTimer, pyqtSignal, Qt, QThread
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.patches import Rectangle
import matplotlib.pyplot as plt

# ====== Font and tick settings ======
plt.rcParams['font.size'] = 14
plt.rcParams['font.family'] = 'Times New Roman'
plt.rcParams['xtick.direction'] = 'in'
plt.rcParams['ytick.direction'] = 'in'
# ====== End of settings ======


class AxisSettingsDialog(QDialog):
    """Dialog window for editing axis limits."""

    def __init__(self, parent, current_limits, title, additional_fields=None):
        super().__init__(parent)
        self.setWindowTitle(title)
        self.setModal(True)
        self.layout = QFormLayout(self)

        self.fields = {}
        self.additional_fields = additional_fields if additional_fields else []

        # Standard fields
        for key in ['Xmin', 'Xmax', 'Ymin', 'Ymax']:
            line_edit = QLineEdit(str(current_limits.get(key, 0)))
            self.fields[key] = line_edit
            self.layout.addRow(f"{key}:", line_edit)

        # Additional fields (e.g., Zmin, Zmax)
        for key in self.additional_fields:
            line_edit = QLineEdit(str(current_limits.get(key, 0)))
            self.fields[key] = line_edit
            self.layout.addRow(f"{key}:", line_edit)

        # Dialog buttons
        self.button_box = QHBoxLayout()
        self.ok_button = QPushButton("OK")
        self.ok_button.clicked.connect(self.accept)
        self.cancel_button = QPushButton("Cancel")
        self.cancel_button.clicked.connect(self.reject)
        self.button_box.addWidget(self.ok_button)
        self.button_box.addWidget(self.cancel_button)
        self.layout.addRow(self.button_box)

    def get_values(self):
        """Return the values entered by the user, or None if any input is invalid."""
        
        values = {}
        for key, widget in self.fields.items():
            try:
                values[key] = float(widget.text())
            except ValueError:
                QMessageBox.warning(self, "Input Error", f"Invalid input for {key}.")
                return None
        return values


class MatplotlibWidget(QWidget):
    """Custom widget that embeds Matplotlib."""

    def __init__(self, parent=None, main_window=None, single_plot=False):
        """
        Parameters:
            parent: Parent widget.
            main_window: Reference to the main window.
            single_plot: If True, only spectrogram is displayed with adjusted layout.
        """
        
        super().__init__(parent)
        self.main_window = main_window
        self.is_recording = False  # True while recording is in progress

        self.layout = QVBoxLayout(self)
        if single_plot:
            # Layout for a single large spectrogram
            self.figure = plt.figure(figsize=(10, 6))
            self.ax_spec = self.figure.add_subplot(111)
            self.ax_wave = None
            self.ax_fft = None
        else:
            # Normal layout
            self.figure = plt.figure(constrained_layout=True, figsize=(10, 6))  # Wider figure to make room for the FFT plot
            self.gs = self.figure.add_gridspec(
                2, 2,
                height_ratios=[1, 1.2],
                width_ratios=[1, 0.6]  # Extra width for the FFT plot
            )

            # Left column: time waveform (top) and spectrogram (bottom)
            self.ax_wave = self.figure.add_subplot(self.gs[0, 0])
            self.ax_spec = self.figure.add_subplot(self.gs[1, 0])

            # Right column: FFT plot
            self.ax_fft = self.figure.add_subplot(self.gs[:, 1])

        self.canvas = FigureCanvas(self.figure)
        self.layout.addWidget(self.canvas)

        # Colorbar
        self.cbar = None

        # Internal spectrogram data
        self.time_axis = None          # Time axis for the waveform [s]
        self.fft_array = None          # Spectrogram data [frequency x time]
        self.spec_time_axis = None     # Time axis of the spectrogram [s]
        self.freq_axis = None          # Frequency axis of the spectrogram [Hz]

        # Image object of the spectrogram
        self.spec_im = None

        # Color range of the spectrogram
        self.user_spec_zmin = None
        self.user_spec_zmax = None
        self.spec_color_manually_set = False  # True once the user sets the color range manually

        # Axis settings of the FFT plot
        self.user_fft_x_limits = None
        self.user_fft_y_limits = None
        self.fft_axes_manually_set = False  # True once the user sets the FFT axes manually

        # Fixed Y-axis range of the FFT plot
        self.fft_y_limits = (None, None)

        # Saved default axis limits
        self.default_limits = {
            'ax_wave': None,
            'ax_spec': None,
            'ax_fft': None,
            'spec_zmin': None,
            'spec_zmax': None
        }

        # Register event handlers
        self.canvas.mpl_connect("button_press_event", self.on_mouse_press)
        self.canvas.mpl_connect("motion_notify_event", self.on_mouse_drag)
        self.canvas.mpl_connect("button_release_event", self.on_mouse_release)

        # Rectangle selection
        self.is_selecting_area = False
        self.rect_start = None
        self.selection_rect = None  # Current rectangle object
        self.rect_edge_color = 'yellow'
        self.rect_linewidth = 1.5

        # Flags set while a waveform cursor is being dragged
        self.is_dragging_wave_start = False
        self.is_dragging_wave_stop = False

        # Cursor lines
        self.cursor_line_wave_start = None
        self.cursor_line_wave_stop = None
        self.cursor_line_spec_start = None
        self.cursor_line_spec_stop = None

        # Indices for updating the spectrogram during recording
        self.record_spec_current_index = 0
        self.record_spec_total_segments = 0

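    def _hide_patch(self, patch):
        """
        Hide a patch with set_visible(False) so that it disappears from the screen.
        The patch is not physically removed from the list, which avoids the problem that ArtistList cannot be reassigned.
        """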
        
        if patch is not None:
            patch.set_visible(False)

    # ================= Waveform display =================
    def plot_waveform(self, waveform, sample_rate):
        """Plot the time-domain waveform."""
        if self.ax_wave is None:
            return  # The waveform is not shown in single-plot mode

        # Sample n is at t = n / sample_rate, so the end time itself is excluded
        self.time_axis = np.arange(len(waveform)) / sample_rate

        self.ax_wave.clear()
        self.ax_wave.plot(self.time_axis, waveform, linewidth=1.0)
        self.ax_wave.set_xlabel("Time [s]")
        self.ax_wave.set_ylabel("Amplitude")
        
        # Show ticks on both sides
        self.ax_wave.yaxis.set_ticks_position('both')
        self.ax_wave.xaxis.set_ticks_position('both')

        if len(self.time_axis) > 0:
            self.ax_wave.set_xlim(0, self.time_axis[-1])
        else:
            self.ax_wave.set_xlim(0, 1)

        # Cursor line (start)
        if self.cursor_line_wave_start is None:
            self.cursor_line_wave_start = self.ax_wave.axvline(0.0, color='red', linestyle='--')
        else:
            self.cursor_line_wave_start.set_xdata([0.0])

        # Cursor line (stop)
        if self.cursor_line_wave_stop is None:
            stop_x = self.time_axis[-1] if len(self.time_axis) else 0.0
            self.cursor_line_wave_stop = self.ax_wave.axvline(stop_x, color='blue', linestyle='--')
        else:
            stop_x = self.time_axis[-1] if len(self.time_axis) else 0.0
            self.cursor_line_wave_stop.set_xdata([stop_x])

        self.canvas.draw_idle()

    # ================= Spectrogram display =================
    def plot_spectrogram(self, waveform, sample_rate, overlap=0, max_chart_time=None, is_recording=False):
        """
        Plot the spectrogram.

        Parameters:
            waveform: numpy array of audio data.
            sample_rate: Sampling rate of the audio.
            overlap: Overlap percentage between segments.
            max_chart_time: Maximum time (in seconds) to display on the spectrogram.
                            If None, display full spectrogram.
            is_recording: Boolean flag indicating if it's recording mode.
        """
        
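        # Frame (FFT) size in samples. Note that Fs here is NOT the sampling rate.
        Fs = 1024
        dBref = 2e-5  # Reference value for the dB conversion
        Ts = len(waveform) / sample_rate  # Total signal duration [s]
        Fc = Fs / sample_rate  # Duration of one frame [s]
        x_ol = Fs * (1 - (overlap / 100))  # Hop between frame starts [samples]
        N_ave = int((Ts - (Fc * (overlap / 100))) / (Fc * (1 - (overlap / 100))))
        N_ave = max(N_ave, 1)  # Prevent N_ave from becoming 0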

        segments = []
        for i in range(N_ave):
            start_point = int(x_ol * i)
            end_point = start_point + Fs
            if end_point <= len(waveform):
                segments.append(waveform[start_point:end_point])
            else:
                break

        han = signal.windows.hann(Fs)
        acf = 1 / (sum(han) / Fs)  # Amplitude correction factor for the Hann window
        segments = [seg * han for seg in segments]

        fft_array = []
        for seg in segments:
            fft_seg = acf * np.abs(fftpack.fft(seg)[:Fs // 2] / (Fs / 2))
            fft_seg_db = self.db(fft_seg, dBref)
            fft_array.append(fft_seg_db)

        fft_array = np.array(fft_array).T

        if is_recording:
            # Initialize the spectrogram buffer for recording mode
            self.record_spec_total_segments = int(max_chart_time / (Fs / sample_rate))
            self.fft_array = np.zeros((Fs // 2, self.record_spec_total_segments))
            self.spec_time_axis = np.linspace(0, max_chart_time, self.record_spec_total_segments)
            self.record_spec_current_index = 0
        else:
            # In normal mode, use the computed data as-is
            self.fft_array = fft_array
            if fft_array.shape[1] > 0:
                self.spec_time_axis = np.linspace(
                    0,
                    N_ave * (1 - overlap / 100) * Fs / sample_rate,
                    fft_array.shape[1]
                )
            else:
                self.spec_time_axis = np.array([])

        self.freq_axis = np.linspace(0, sample_rate / 2, Fs // 2)

        # Remove the existing colorbar
        if self.cbar is not None:
            try:
                self.cbar.remove()
            except Exception as e:
                print("[Warning] cbar.remove() failed:", e)
            self.cbar = None

        self.ax_spec.clear()

        if self.fft_array.size > 0:
            im = self.ax_spec.imshow(
                self.fft_array,
                vmin=np.min(self.fft_array),
                vmax=np.max(self.fft_array),
                extent=[self.spec_time_axis[0], self.spec_time_axis[-1],
                        self.freq_axis[0], self.freq_axis[-1]],
                aspect='auto',
                cmap='jet',
                origin='lower'
            )
            self.spec_im = im  # Keep the spectrogram image object
            if len(self.spec_time_axis) >= 2:
                self.spec_im.set_extent([self.spec_time_axis[0], self.spec_time_axis[-1],
                                         self.freq_axis[0], self.freq_axis[-1]])
                self.ax_spec.set_xlim(self.spec_time_axis[0], self.spec_time_axis[-1])
            else:
                # If spec_time_axis has too few points, fall back to a default spectrogram range
                self.spec_im.set_extent([0, 5, self.freq_axis[0], self.freq_axis[-1]])
                self.ax_spec.set_xlim(0, 5)
            self.ax_spec.set_ylim(self.freq_axis[0], self.freq_axis[-1])

            try:
                self.cbar = self.figure.colorbar(im, ax=self.ax_spec, orientation='vertical', pad=0.02)
                self.cbar.set_label("Amplitude [dB]")
            except Exception as e:
                print("[Warning] colorbar creation failed:", e)
                self.cbar = None
        else:
            # Axis setup when there is no data
            if max_chart_time:
                self.ax_spec.set_xlim(0, max_chart_time)
                self.ax_spec.set_ylim(0, 1)
            else:
                self.ax_spec.set_xlim(0, 1)
                self.ax_spec.set_ylim(0, 1)
            self.spec_im = None

        self.ax_spec.set_xlabel("Time [s]")
        self.ax_spec.set_ylabel("Frequency [Hz]")

        # Hide the old selection_rect if there is one
        if self.selection_rect is not None:
            self._hide_patch(self.selection_rect)
        self.selection_rect = None
        self.is_selecting_area = False

        # Recreate the spectrogram cursors
        self.cursor_line_spec_start = self.ax_spec.axvline(0.0, color='red', linestyle='--')
        if max_chart_time:
            stop_x = max_chart_time
        else:
            stop_x = self.spec_time_axis[-1] if len(self.spec_time_axis) > 0 else 0.0
        self.cursor_line_spec_stop = self.ax_spec.axvline(stop_x, color='blue', linestyle='--')

        # Fix the Y axis of the FFT plot to the dB range of the spectrogram
        if self.fft_array.size > 0:
            self.fft_y_limits = (np.min(self.fft_array), np.max(self.fft_array))
            # Apply the defaults only if the spectrogram color range has not been set manually
            if not self.spec_color_manually_set:
                if self.ax_fft:
                    self.ax_fft.set_xlim(0, sample_rate / 2)
                    if self.fft_y_limits[0] is not None and self.fft_y_limits[1] is not None:
                        self.ax_fft.set_ylim(self.fft_y_limits)
                # Save the default Zmin and Zmax of the spectrogram
                self.default_limits['spec_zmin'] = np.min(self.fft_array)
                self.default_limits['spec_zmax'] = np.max(self.fft_array)
            else:
                # A manually set color range does not affect the FFT axes
                if self.spec_im is not None and self.user_spec_zmin is not None and self.user_spec_zmax is not None:
                    self.spec_im.set_clim(self.user_spec_zmin, self.user_spec_zmax)
                    if self.cbar:
                        self.cbar.update_normal(self.spec_im)
        else:
            self.fft_y_limits = (0, 1)
            if not self.spec_color_manually_set:
                if self.ax_fft:
                    self.ax_fft.set_xlim(0, sample_rate / 2)
                    self.ax_fft.set_ylim(0, 1)
                self.default_limits['spec_zmin'] = 0
                self.default_limits['spec_zmax'] = 1
            else:
                if self.spec_im is not None and self.user_spec_zmin is not None and self.user_spec_zmax is not None:
                    self.spec_im.set_clim(self.user_spec_zmin, self.user_spec_zmax)
                    if self.cbar:
                        self.cbar.update_normal(self.spec_im)

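        # Reset the FFT plot (this also removes any grid)
        if self.ax_fft:
            self.ax_fft.clear()
            self.ax_fft.set_title("FFT")
            self.ax_fft.set_xlabel("Frequency [Hz]")
            self.ax_fft.set_ylabel("Amplitude [dB]")
            # Show ticks on both sides
            self.ax_fft.yaxis.set_ticks_position('both')
            self.ax_fft.xaxis.set_ticks_position('both')
            # clear() also discards the axis limits set above, so re-apply them
            if not self.spec_color_manually_set:
                self.ax_fft.set_xlim(0, sample_rate / 2)
                if self.fft_y_limits[0] is not None and self.fft_y_limits[1] is not None:
                    self.ax_fft.set_ylim(self.fft_y_limits)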

        self.canvas.draw_idle()

    @staticmethod
    def db(x, dBref):
        """Convert amplitudes to decibels relative to dBref."""
        return 20 * np.log10(np.maximum(x, dBref) / dBref)

    @staticmethod
    def compute_overall_level_db(db_matrix):
        """Compute the overall level in dB (the first row is skipped as the DC bin)."""
        if db_matrix.shape[0] <= 1:
            return None
        db_matrix_no0 = db_matrix[1:, :]
        linear_vals = np.power(10.0, db_matrix_no0 / 10.0)
        total_power = np.sum(linear_vals)
        if total_power <= 0:
            return None
        return 10.0 * np.log10(total_power)

    # ============ Rectangle drag selection ============
    def start_rectangle_selection(self, xdata, ydata):
        """Start dragging a rectangle."""
        if self.selection_rect is not None:
            self._hide_patch(self.selection_rect)
        self.selection_rect = None

        self.is_selecting_area = True
        self.rect_start = (xdata, ydata)

        new_rect = Rectangle(
            (xdata, ydata), 0, 0,
            edgecolor=self.rect_edge_color,
            facecolor='none',
            linewidth=self.rect_linewidth
        )
        self.ax_spec.add_patch(new_rect)
        self.selection_rect = new_rect

    def update_rectangle_selection(self, xdata, ydata):
        """While dragging: update the size of the rectangle."""
        
        if not self.is_selecting_area or self.selection_rect is None:
            return
        x0, y0 = self.rect_start
        self.selection_rect.set_x(min(x0, xdata))
        self.selection_rect.set_y(min(y0, ydata))
        self.selection_rect.set_width(abs(xdata - x0))
        self.selection_rect.set_height(abs(ydata - y0))
        self.canvas.draw_idle()

    def finish_rectangle_selection(self, xdata, ydata):
        """End of drag: finish while leaving the rectangle on screen."""
        
        if not self.is_selecting_area or self.selection_rect is None:
            return
        self.is_selecting_area = False

        x0, y0 = self.rect_start
        x1, y1 = xdata, ydata

        if self.spec_time_axis is None or len(self.spec_time_axis) == 0:
            return
        if self.freq_axis is None or len(self.freq_axis) == 0:
            return

        tmin, tmax = sorted([x0, x1])
        fmin, fmax = sorted([y0, y1])

        tmin = max(tmin, self.spec_time_axis[0]) if len(self.spec_time_axis) > 0 else 0
        tmax = min(tmax, self.spec_time_axis[-1]) if len(self.spec_time_axis) > 0 else 0
        fmin = max(fmin, self.freq_axis[0]) if len(self.freq_axis) > 0 else 0
        fmax = min(fmax, self.freq_axis[-1]) if len(self.freq_axis) > 0 else 0

        if tmin >= tmax or fmin >= fmax:
            return

        time_indices = np.where(
            (self.spec_time_axis >= tmin) & (self.spec_time_axis <= tmax)
        )[0]
        freq_indices = np.where(
            (self.freq_axis >= fmin) & (self.freq_axis <= fmax)
        )[0]

        if len(time_indices) == 0 or len(freq_indices) == 0:
            return

        selected_region = self.fft_array[np.ix_(freq_indices, time_indices)]
        max_val = np.max(selected_region)
        freq_idx_in_sel, _ = np.unravel_index(
            np.argmax(selected_region), selected_region.shape
        )
        actual_freq_idx = freq_indices[freq_idx_in_sel]
        max_freq = self.freq_axis[actual_freq_idx]

        freq_indices_no0 = freq_indices[freq_indices > 0]
        if len(freq_indices_no0) == 0:
            partial_oa_db = None
        else:
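            # Row k of selected_region is bin freq_indices.min() + k,
            # so subtract freq_indices.min() to map bin numbers to rows
            partial_region_no0 = selected_region[
                freq_indices_no0 - freq_indices.min(), :
            ]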
            partial_oa_db = self.compute_overall_level_db(partial_region_no0)

        # Report the results to MainWindow
        if self.main_window:
            self.main_window.update_max_info(max_val, max_freq, is_whole=False)
            self.main_window.update_oa_info(partial_oa_db, is_whole=False)

        self.canvas.draw_idle()

    # ============ Mouse events ============
    def on_mouse_press(self, event):
        """Handler for mouse press events."""

        if event.dblclick:
            # Open the axis settings dialog on double-click
            self.open_axis_settings_dialog(event)
            return

        if self.ax_wave and event.inaxes == self.ax_wave and event.button == 1:
            xdata = event.xdata
            if xdata is None:
                return
            if self.cursor_line_wave_start is None or self.cursor_line_wave_stop is None:
                return
            start_pos = self.cursor_line_wave_start.get_xdata()[0]
            stop_pos = self.cursor_line_wave_stop.get_xdata()[0]
            if abs(xdata - start_pos) < abs(xdata - stop_pos):
                self.is_dragging_wave_start = True
            else:
                self.is_dragging_wave_stop = True

        elif self.ax_spec and event.inaxes == self.ax_spec and event.button == 1:
            # Start dragging a new rectangle
            if event.xdata is not None and event.ydata is not None:
                self.start_rectangle_selection(event.xdata, event.ydata)

    def on_mouse_drag(self, event):
        """Handler for mouse drag (motion) events."""
        
        if self.is_dragging_wave_start and self.ax_wave and event.inaxes == self.ax_wave:
            if event.xdata is not None:
                self.cursor_line_wave_start.set_xdata([event.xdata])
                self._adjust_stop_cursor_if_needed()
                if self.cursor_line_spec_start:
                    self.cursor_line_spec_start.set_xdata([event.xdata])

                # --- Update the FFT slice ---
                self.update_fft_slice(event.xdata)

                self.canvas.draw_idle()

        elif self.is_dragging_wave_stop and self.ax_wave and event.inaxes == self.ax_wave:
            if event.xdata is not None:
                self.cursor_line_wave_stop.set_xdata([event.xdata])
                self._adjust_stop_cursor_if_needed()
                if self.cursor_line_spec_stop:
                    self.cursor_line_spec_stop.set_xdata([event.xdata])
                # Moving the stop cursor does not update the FFT slice (only the red cursor does)
                self.canvas.draw_idle()

        elif self.is_selecting_area and self.ax_spec and event.inaxes == self.ax_spec:
            if event.xdata is not None and event.ydata is not None:
                self.update_rectangle_selection(event.xdata, event.ydata)

    def on_mouse_release(self, event):
        """Handler for mouse release events."""
        
        if self.is_dragging_wave_start or self.is_dragging_wave_stop:
            self.is_dragging_wave_start = False
            self.is_dragging_wave_stop = False
            return

        if self.is_selecting_area and self.ax_spec and event.inaxes == self.ax_spec:
            if event.xdata is not None and event.ydata is not None:
                self.finish_rectangle_selection(event.xdata, event.ydata)

    def _adjust_stop_cursor_if_needed(self):
        """Keep the stop cursor from moving to the left of the start cursor."""
        
        if self.ax_wave is None:
            return
        start_wave = self.cursor_line_wave_start.get_xdata()[0]
        stop_wave = self.cursor_line_wave_stop.get_xdata()[0]
        if stop_wave < start_wave:
            forced_stop = start_wave + 0.001
            if self.time_axis is not None and len(self.time_axis) > 0:
                if forced_stop > self.time_axis[-1]:
                    forced_stop = self.time_axis[-1]
            self.cursor_line_wave_stop.set_xdata([forced_stop])

    # ============ Addition: FFT plot ============
    def update_fft_slice(self, time_sec):
        """
        Draw the FFT amplitude spectrum at the given time_sec on ax_fft.

        Parameters:
            time_sec: Time in seconds.
        """
        
        if self.fft_array is None or self.fft_array.size == 0:
            return
        if self.spec_time_axis is None or len(self.spec_time_axis) == 0:
            return
        if self.freq_axis is None or len(self.freq_axis) == 0:
            return

        # Clip time_sec if it falls outside the range of spec_time_axis
        if time_sec < self.spec_time_axis[0]:
            time_sec = self.spec_time_axis[0]
        if time_sec > self.spec_time_axis[-1]:
            time_sec = self.spec_time_axis[-1]

        # Find the spectrogram column whose time is closest to time_sec
        idx = int(np.argmin(np.abs(self.spec_time_axis - time_sec)))
        idx = max(0, min(idx, self.fft_array.shape[1] - 1))

        # Get the amplitude [dB] along the frequency axis (= self.fft_array[:, idx])
        slice_db = self.fft_array[:, idx]

        if self.ax_fft is None:
            return  # The FFT plot is not shown in single-plot mode

        # Redraw ax_fft
        self.ax_fft.clear()
        self.ax_fft.plot(self.freq_axis, slice_db, color='magenta', linewidth=1.0)
        self.ax_fft.set_title(f"FFT @ t={time_sec:.3f}s")
        self.ax_fft.set_xlabel("Frequency [Hz]")
        self.ax_fft.set_ylabel("Amplitude [dB]")
        
        # Show ticks on both sides
        self.ax_fft.yaxis.set_ticks_position('both')
        self.ax_fft.xaxis.set_ticks_position('both')

        # Use a fixed scale for the FFT axes
        if not self.fft_axes_manually_set:
            if self.main_window and self.main_window.current_sr:
                self.ax_fft.set_xlim(0, self.main_window.current_sr / 2)
            else:
                self.ax_fft.set_xlim(0, self.freq_axis[-1] if len(self.freq_axis) > 0 else 1)
            if self.fft_y_limits[0] is not None and self.fft_y_limits[1] is not None:
                self.ax_fft.set_ylim(self.fft_y_limits)
        # If the user set the axes manually, keep those limits
        else:
            if self.user_fft_x_limits and self.user_fft_y_limits:
                self.ax_fft.set_xlim(self.user_fft_x_limits)
                self.ax_fft.set_ylim(self.user_fft_y_limits)

        self.canvas.draw_idle()

    # ============ Axis settings dialog ============
    def open_axis_settings_dialog(self, event):
        """
        Open the axis settings dialog for the axes that was double-clicked.

        Parameters:
            event: Mouse event object.
        """
        if self.ax_wave and event.inaxes == self.ax_wave:
            current_limits = {
                'Xmin': self.ax_wave.get_xlim()[0],
                'Xmax': self.ax_wave.get_xlim()[1],
                'Ymin': self.ax_wave.get_ylim()[0],
                'Ymax': self.ax_wave.get_ylim()[1],
            }
            title = "Set Time Waveform Axis Limits"
            dialog = AxisSettingsDialog(self, current_limits, title)
            if dialog.exec_() == QDialog.Accepted:
                new_limits = dialog.get_values()
                if new_limits:
                    self.ax_wave.set_xlim(new_limits['Xmin'], new_limits['Xmax'])
                    self.ax_wave.set_ylim(new_limits['Ymin'], new_limits['Ymax'])
                    self.canvas.draw_idle()
                    # Redraw the FFT plot after the update as well
                    current_time = self.cursor_line_wave_start.get_xdata()[0]
                    self.update_fft_slice(current_time)

        elif self.ax_spec and event.inaxes == self.ax_spec:
            if self.spec_im is not None:
                current_limits = {
                    'Xmin': self.ax_spec.get_xlim()[0],
                    'Xmax': self.ax_spec.get_xlim()[1],
                    'Ymin': self.ax_spec.get_ylim()[0],
                    'Ymax': self.ax_spec.get_ylim()[1],
                    'Zmin': self.spec_im.get_clim()[0],
                    'Zmax': self.spec_im.get_clim()[1],
                }
            else:
                current_limits = {
                    'Xmin': 0,
                    'Xmax': 1,
                    'Ymin': 0,
                    'Ymax': 1,
                    'Zmin': 0,
                    'Zmax': 1,
                }
            title = "Set Spectrogram Axis Limits"
            dialog = AxisSettingsDialog(
                self,
                current_limits,
                title,
                additional_fields=['Zmin', 'Zmax']
            )
            if dialog.exec_() == QDialog.Accepted:
                new_limits = dialog.get_values()
                if new_limits:
                    self.ax_spec.set_xlim(new_limits['Xmin'], new_limits['Xmax'])
                    self.ax_spec.set_ylim(new_limits['Ymin'], new_limits['Ymax'])
                    # Set colorbar limits
                    if 'Zmin' in new_limits and 'Zmax' in new_limits:
                        if self.spec_im is not None:
                            self.spec_im.set_clim(new_limits['Zmin'], new_limits['Zmax'])
                            if self.cbar:
                                self.cbar.update_normal(self.spec_im)
                    self.canvas.draw_idle()
                    # Update user settings for spectrogram color range
                    self.spec_color_manually_set = True
                    self.user_spec_zmin = new_limits.get('Zmin', None)
                    self.user_spec_zmax = new_limits.get('Zmax', None)

        elif self.ax_fft and event.inaxes == self.ax_fft:
            current_limits = {
                'Xmin': self.ax_fft.get_xlim()[0],
                'Xmax': self.ax_fft.get_xlim()[1],
                'Ymin': self.ax_fft.get_ylim()[0],
                'Ymax': self.ax_fft.get_ylim()[1],
            }
            title = "Set FFT Axis Limits"
            dialog = AxisSettingsDialog(self, current_limits, title)
            if dialog.exec_() == QDialog.Accepted:
                new_limits = dialog.get_values()
                if new_limits:
                    self.ax_fft.set_xlim(new_limits['Xmin'], new_limits['Xmax'])
                    self.ax_fft.set_ylim(new_limits['Ymin'], new_limits['Ymax'])
                    self.canvas.draw_idle()
                    # Set the manual-setting flag and save the limits
                    self.fft_axes_manually_set = True
                    self.user_fft_x_limits = (new_limits['Xmin'], new_limits['Xmax'])
                    self.user_fft_y_limits = (new_limits['Ymin'], new_limits['Ymax'])

    # ============ Saving and resetting default axis limits ============
    def set_default_limits(self):
        """Save the current axis limits as the defaults."""
        
        if self.ax_wave:
            self.default_limits['ax_wave'] = self.ax_wave.get_xlim() + self.ax_wave.get_ylim()
        if self.ax_spec:
            self.default_limits['ax_spec'] = self.ax_spec.get_xlim() + self.ax_spec.get_ylim()
        if self.ax_fft:
            self.default_limits['ax_fft'] = self.ax_fft.get_xlim() + self.ax_fft.get_ylim()
        if self.spec_im is not None:
            self.default_limits['spec_zmin'] = self.spec_im.get_clim()[0]
            self.default_limits['spec_zmax'] = self.spec_im.get_clim()[1]
        else:
            self.default_limits['spec_zmin'] = 0
            self.default_limits['spec_zmax'] = 1

    def reset_axes(self):
        """Reset the axis limits to the defaults (Home button)."""
        
        if self.ax_wave and self.default_limits['ax_wave']:
            self.ax_wave.set_xlim(
                self.default_limits['ax_wave'][0],
                self.default_limits['ax_wave'][1]
            )
            self.ax_wave.set_ylim(
                self.default_limits['ax_wave'][2],
                self.default_limits['ax_wave'][3]
            )
        if self.ax_spec and self.default_limits['ax_spec']:
            self.ax_spec.set_xlim(
                self.default_limits['ax_spec'][0],
                self.default_limits['ax_spec'][1]
            )
            self.ax_spec.set_ylim(
                self.default_limits['ax_spec'][2],
                self.default_limits['ax_spec'][3]
            )
            # Reset the spectrogram color range to the default
            if self.spec_im is not None and 'spec_zmin' in self.default_limits and 'spec_zmax' in self.default_limits:
                self.spec_im.set_clim(
                    self.default_limits['spec_zmin'],
                    self.default_limits['spec_zmax']
                )
                if self.cbar:
                    self.cbar.update_normal(self.spec_im)
        if self.ax_fft and self.default_limits['ax_fft']:
            self.ax_fft.set_xlim(
                self.default_limits['ax_fft'][0],
                self.default_limits['ax_fft'][1]
            )
            self.ax_fft.set_ylim(
                self.default_limits['ax_fft'][2],
                self.default_limits['ax_fft'][3]
            )

        # Reset user settings
        self.spec_color_manually_set = False
        self.user_spec_zmin = None
        self.user_spec_zmax = None
        self.fft_axes_manually_set = False
        self.user_fft_x_limits = None
        self.user_fft_y_limits = None

        self.canvas.draw_idle()


class RecordThread(QThread):
    """Thread class that records audio in the background."""

    data_signal = pyqtSignal(np.ndarray)
    finished_signal = pyqtSignal(str)
    error_signal = pyqtSignal(str)

    def __init__(self, mic_index, samplerate, frames_per_buffer, record_seconds, filename):
        super().__init__()
        self.mic_index = mic_index
        self.samplerate = samplerate
        self.frames_per_buffer = frames_per_buffer
        self.record_seconds = record_seconds
        self.filename = filename
        self._is_running = True

    def run(self):
        """Record from the selected device, emit each block, then save a WAV file."""

        pa = None
        stream = None
        try:
            pa = pyaudio.PyAudio()
            stream = pa.open(
                format=pyaudio.paInt16,
                channels=1,
                rate=self.samplerate,
                input=True,
                frames_per_buffer=self.frames_per_buffer,
                input_device_index=self.mic_index
            )
            frames = []
            total_frames = int(self.samplerate / self.frames_per_buffer * self.record_seconds)
            for _ in range(total_frames):
                if not self._is_running:
                    break
                data = stream.read(self.frames_per_buffer, exception_on_overflow=False)
                frames.append(data)
                # Convert 16-bit PCM bytes to a float32 array in [-1.0, 1.0)
                audio_data = np.frombuffer(data, dtype=np.int16).astype(np.float32) / 32768.0
                self.data_signal.emit(audio_data)
            stream.stop_stream()
            stream.close()
            stream = None

            # Save WAV file (mono, 16-bit); the module-level helper does not
            # depend on a PyAudio instance that may already be terminated
            with wave.open(self.filename, 'wb') as wf:
                wf.setnchannels(1)
                wf.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
                wf.setframerate(self.samplerate)
                wf.writeframes(b''.join(frames))

            self.finished_signal.emit(self.filename)
        except Exception as e:
            self.error_signal.emit(str(e))
        finally:
            # Release the audio device even if recording failed part-way
            if stream is not None:
                stream.close()
            if pa is not None:
                pa.terminate()

    def stop(self):
        """Ask the recording loop to stop."""
        self._is_running = False


class RecordWindow(QDialog):
    """Dialog window class for recording."""

    recording_finished = pyqtSignal(str)

    def __init__(self, parent=None):
        super().__init__(parent)
        self.setWindowTitle("Record WAV")
        self.setModal(True)
        self.layout = QVBoxLayout(self)

        self.form_layout = QFormLayout()
        self.layout.addLayout(self.form_layout)

        # Recording duration
        self.record_time_spin = QDoubleSpinBox(self)
        self.record_time_spin.setRange(1.0, 600.0)  # up to 10 minutes
        self.record_time_spin.setValue(5.0)
        self.record_time_spin.setSingleStep(1.0)
        self.form_layout.addRow("Record Time [s]:", self.record_time_spin)

        # Microphone selection drop-down (populated below, once the Record button exists)
        self.mic_combo = QComboBox(self)
        self.form_layout.addRow("Select Microphone:", self.mic_combo)

        # WAV filename input
        self.filename_layout = QHBoxLayout()
        self.filename_edit = QLineEdit(self)
        self.filename_edit.setPlaceholderText("recorded")
        self.filename_layout.addWidget(self.filename_edit)
        self.filename_layout.addWidget(QLabel(".wav"))
        self.form_layout.addRow("Filename:", self.filename_layout)

        # Record and Close buttons
        self.button_layout = QHBoxLayout()
        self.record_button = QPushButton("Record", self)
        self.record_button.clicked.connect(self.toggle_recording)
        self.button_layout.addWidget(self.record_button)

        # populate_microphones() may disable record_button, so call it after the button exists
        self.populate_microphones()

        self.close_button = QPushButton("Close", self)
        self.close_button.clicked.connect(self.close)
        self.button_layout.addWidget(self.close_button)

        self.layout.addLayout(self.button_layout)

        # Spectrogram display
        self.spec_widget = MatplotlibWidget(
            self, main_window=self.parent(), single_plot=True
        )  # single_plot=True shows only the spectrogram
        self.layout.addWidget(self.spec_widget)

        # Recording state
        self.record_thread = None
        self.is_recording = False

    def populate_microphones(self):
        """Fetch the list of input devices and add them to the drop-down."""

        pa = pyaudio.PyAudio()
        self.mic_dict = {}  # maps device name to device index
        for i in range(pa.get_device_count()):
            info = pa.get_device_info_by_index(i)
            if info['maxInputChannels'] > 0:
                name = info['name']
                self.mic_combo.addItem(name)
                self.mic_dict[name] = i
        pa.terminate()

        if self.mic_combo.count() == 0:
            self.mic_combo.addItem("No Microphone Found")
            self.record_button.setEnabled(False)

    def toggle_recording(self):
        """Toggle between starting and stopping the recording."""
        
        if not hasattr(self, 'is_recording') or not self.is_recording:
            # Start recording
            self.start_recording()
        else:
            # Stop recording
            self.stop_recording()

    def start_recording(self):
        """Start recording."""

        # Read the input values
        record_time = self.record_time_spin.value()
        selected_mic_name = self.mic_combo.currentText()
        mic_index = self.mic_dict.get(selected_mic_name, None)
        filename = self.filename_edit.text().strip()
        if not filename:
            filename = "recorded"
        if not filename.lower().endswith(".wav"):
            filename += ".wav"

        # Make sure a microphone is selected
        if mic_index is None:
            QMessageBox.warning(self, "Error", "No microphone selected.")
            return

        # Switch the UI into the recording state
        self.is_recording = True
        self.record_button.setText("Stop")
        self.record_button.setStyleSheet("background-color: red")
        self.close_button.setEnabled(False)
        self.record_time_spin.setEnabled(False)
        self.mic_combo.setEnabled(False)
        self.filename_edit.setEnabled(False)

        # Initialize spectrogram for recording
        self.spec_widget.is_recording = True
        self.spec_widget.plot_spectrogram(
            np.array([]), 44100, max_chart_time=record_time, is_recording=True
        )

        # Start recording thread
        self.record_thread = RecordThread(
            mic_index=mic_index,
            samplerate=44100,  # sampling rate [Hz]
            frames_per_buffer=1024,  # matches fft_size=1024
            record_seconds=record_time,
            filename=filename
        )
        self.record_thread.data_signal.connect(self.update_spectrogram)
        self.record_thread.finished_signal.connect(self.on_recording_finished)
        self.record_thread.error_signal.connect(self.on_recording_error)
        self.record_thread.start()

    def stop_recording(self):
        """Stop the recording."""
        
        if self.record_thread and self.record_thread.isRunning():
            self.record_thread.stop()
            self.record_thread.wait()

    def update_spectrogram(self, audio_data):
        """Update the real-time spectrogram with one block of audio."""

        # fft_size=1024, overlap=0
        fft_size = 1024
        overlap = 0  # overlap ratio of 0
        max_chart_time = self.record_time_spin.value()  # based on the recording duration

        # FFT
        spectrum = fftpack.fft(audio_data, n=fft_size)
        amp = np.abs(spectrum[:fft_size // 2]) / (fft_size / 2)
        amp_db = self.spec_widget.db(amp, 2e-5)  # use the same dB reference everywhere

        # Update the spectrogram only while recording
        if self.spec_widget.is_recording:
            if (self.spec_widget.fft_array is not None and
                    self.spec_widget.record_spec_current_index < self.spec_widget.record_spec_total_segments):
                # Write this block into the current column of the existing spectrogram
                self.spec_widget.fft_array[:, self.spec_widget.record_spec_current_index] = amp_db
                self.spec_widget.record_spec_current_index += 1

                # Compute the minimum and maximum for the colorbar
                current_min = np.min(self.spec_widget.fft_array)
                current_max = np.max(self.spec_widget.fft_array)

                # Compare with the current colorbar range
                if self.spec_widget.cbar:
                    current_clim = self.spec_widget.spec_im.get_clim()
                    if current_min < current_clim[0] or current_max > current_clim[1]:
                        # Update the colorbar range
                        self.spec_widget.spec_im.set_clim(current_min, current_max)
                        self.spec_widget.cbar.update_normal(self.spec_widget.spec_im)

                # Redraw the spectrogram
                self.spec_widget.spec_im.set_data(self.spec_widget.fft_array)
                self.spec_widget.canvas.draw_idle()

    def on_recording_finished(self, file_path):
        """Signal handler called after recording has finished."""

        print(f"Recording finished. File path: {file_path}")  # for debugging
        if file_path:
            # Notify the main window, which is the only place that shows the message box
            self.recording_finished.emit(file_path)
        else:
            QMessageBox.warning(self, "Recording Failed", "Recording was not successful.")
        # Reset the buttons
        self.is_recording = False
        self.record_button.setText("Record")
        self.record_button.setStyleSheet("")
        self.close_button.setEnabled(True)
        self.record_time_spin.setEnabled(True)
        self.mic_combo.setEnabled(True)
        self.filename_edit.setEnabled(True)
        # Close the recording window automatically
        self.close()

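    def on_recording_error(self, error_message):
        """Signal handler called when an error occurs during recording."""

        QMessageBox.critical(self, "Recording Error", error_message)
        self.stop_recording()
        # Clear the flag so closeEvent does not ask about a recording in progress
        self.is_recording = False
        self.close()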

    def closeEvent(self, event):
        """Event handler called when the window is closed."""
        
        if hasattr(self, 'is_recording') and self.is_recording:
            reply = QMessageBox.question(
                self,
                'Recording in progress',
                'Recording is still in progress. Do you want to stop and close?',
                QMessageBox.Yes | QMessageBox.No,
                QMessageBox.No
            )
            if reply == QMessageBox.Yes:
                self.stop_recording()
            else:
                event.ignore()
                return
        event.accept()


class MainWindow(QMainWindow):
    """Main window class of the application."""

    def __init__(self):
        super().__init__()
        self.setWindowTitle("Wav Analyzer Tools")
        self.main_widget = QWidget(self)
        self.setCentralWidget(self.main_widget)

        self.main_hbox = QHBoxLayout(self.main_widget)

        # Left side
        self.left_widget = QWidget(self)
        self.left_layout = QVBoxLayout(self.left_widget)
        self.main_hbox.addWidget(self.left_widget, stretch=1)

        self.matplotlib_widget = MatplotlibWidget(self, main_window=self)
        self.left_layout.addWidget(self.matplotlib_widget)

        self.button_layout = QHBoxLayout()
        self.left_layout.addLayout(self.button_layout)

        # Record WAV button
        self.record_button = QPushButton("Record WAV", self)
        self.record_button.clicked.connect(self.open_record_window)
        self.button_layout.addWidget(self.record_button)

        # Load WAV button
        self.load_button = QPushButton("Load WAV", self)
        self.load_button.clicked.connect(lambda: self.load_wav())
        self.button_layout.addWidget(self.load_button)

        # Play button
        self.play_button = QPushButton("Play", self)
        self.play_button.clicked.connect(self.play_audio)
        self.button_layout.addWidget(self.play_button)

        # Stop button
        self.stop_button = QPushButton("Stop", self)
        self.stop_button.clicked.connect(self.stop_playback)  # clicking Stop calls stop_playback
        self.button_layout.addWidget(self.stop_button)

        # Playback speed control
        self.speed_label = QLabel("Speed Factor", self)
        self.button_layout.addWidget(self.speed_label)

        self.speed_box = QDoubleSpinBox(self)
        self.speed_box.setRange(0.1, 10.0)    # cap the playback speed at 10x
        self.speed_box.setValue(1.0)
        self.speed_box.setSingleStep(0.1)
        self.button_layout.addWidget(self.speed_box)

        # Home button
        self.home_button = QPushButton("Home", self)
        self.home_button.clicked.connect(self.reset_axes)
        self.left_layout.addWidget(self.home_button)

        # Right side (max value and O.A. display)
        self.info_widget = QWidget(self)
        self.info_layout = QVBoxLayout(self.info_widget)
        self.main_hbox.addWidget(self.info_widget, stretch=0)

        self.label_title = QLabel("<b>Spectrogram Info</b>", self)
        self.info_layout.addWidget(self.label_title)

        self.label_max_value = QLabel("Max Value[dB]: ---", self)
        self.info_layout.addWidget(self.label_max_value)

        self.label_max_freq = QLabel("Max Freq[Hz]: ---", self)
        self.info_layout.addWidget(self.label_max_freq)

        self.label_oa_value = QLabel("O.A.[dB]: ---", self)
        self.info_layout.addWidget(self.label_oa_value)

        self.info_layout.addStretch(1)

        self.current_waveform = None
        self.current_sr = None

        self._waveform_for_playback = None
        self._current_index = 0
        self._wave_len_no_silence = 1
        self._start_pos_for_playback = 0.0
        self._stop_pos_for_playback = 0.0

        self.sd_stream = None
        self.position_queue = queue.Queue()

        self.update_timer = QTimer()
        self.update_timer.setInterval(20)
        self.update_timer.timeout.connect(self.poll_queue_and_update_cursor)
        self.update_timer.start()

    def open_record_window(self):
        """Open the recording window when the Record WAV button is clicked."""
        
        self.record_window = RecordWindow(self)
        self.record_window.recording_finished.connect(self.on_recording_finished)
        self.record_window.show()

    def on_recording_finished(self, file_path):
        """Signal handler called after recording has finished."""

        print(f"Recording finished. File path: {file_path}")  # for debugging
        if file_path:
            # Show the message box only in the main window
            QMessageBox.information(
                self,
                "Recording Finished",
                f"Recording saved to {file_path}"
            )
            self.load_wav(file_path)
        else:
            QMessageBox.warning(self, "Recording Failed", "Recording was not successful.")

    def load_wav(self, file_path=None):
        """Load a WAV file and display it."""

        print("load_wav called")  # for debugging
        self.stop_playback()
        if file_path is None:
            file_path, _ = QFileDialog.getOpenFileName(
                self, "Select WAV File", "", "WAV Files (*.wav)"
            )
            if not file_path:
                return

        if not isinstance(file_path, str):
            print(f"Invalid file path: {file_path}")
            return

        try:
            waveform, sr = torchaudio.load(file_path)
        except Exception as e:
            print(f"Error loading WAV file: {e}")
            QMessageBox.critical(
                self,
                "Error",
                f"Could not load WAV file:\n{e}"
            )
            return

        if waveform.shape[0] > 1:
            waveform = waveform.mean(dim=0, keepdim=True)
        self.current_waveform = waveform.numpy()[0].copy()
        self.current_sr = sr

        # Reset the cursors
        self.matplotlib_widget.cursor_line_wave_start = None
        self.matplotlib_widget.cursor_line_wave_stop = None
        self.matplotlib_widget.cursor_line_spec_start = None
        self.matplotlib_widget.cursor_line_spec_stop = None

        # Update the waveform and spectrogram
        try:
            self.matplotlib_widget.plot_waveform(self.current_waveform, self.current_sr)
            self.matplotlib_widget.plot_spectrogram(
                self.current_waveform,
                self.current_sr,
                max_chart_time=None,
                is_recording=False
            )  # show the whole spectrogram
            self.update_whole_spectrogram_info()
            self.update_whole_oa_info()
        except Exception as e:
            print(f"Error in plotting: {e}")
            QMessageBox.critical(
                self,
                "Error",
                f"Error in plotting:\n{e}"
            )
            return

        # Move the cursor to 0 s and update the FFT slice
        if self.matplotlib_widget.cursor_line_wave_start:
            self.matplotlib_widget.cursor_line_wave_start.set_xdata([0.0])
        if self.matplotlib_widget.cursor_line_spec_start:
            self.matplotlib_widget.cursor_line_spec_start.set_xdata([0.0])
        self.matplotlib_widget.update_fft_slice(0.0)
        self.matplotlib_widget.canvas.draw_idle()

        # Save the default axis limits
        self.matplotlib_widget.set_default_limits()

        # Reset the manual spectrogram color-range settings for the newly loaded file
        self.matplotlib_widget.spec_color_manually_set = False
        self.matplotlib_widget.user_spec_zmin = None
        self.matplotlib_widget.user_spec_zmax = None

        # Reset the manual FFT axis settings for the newly loaded file
        self.matplotlib_widget.fft_axes_manually_set = False
        self.matplotlib_widget.user_fft_x_limits = None
        self.matplotlib_widget.user_fft_y_limits = None

    def update_whole_spectrogram_info(self):
        """Update the info labels for the whole spectrogram."""
        
        fft_array = self.matplotlib_widget.fft_array
        freq_axis = self.matplotlib_widget.freq_axis
        if fft_array is None or fft_array.size == 0:
            self.label_max_value.setText("Max Value[dB]: ---")
            self.label_max_freq.setText("Max Freq[Hz]: ---")
            return

        max_val = np.max(fft_array)
        freq_idx, _ = np.unravel_index(np.argmax(fft_array), fft_array.shape)
        if freq_idx < len(freq_axis):
            max_freq = freq_axis[freq_idx]
        else:
            max_freq = 0.0
        self.update_max_info(max_val, max_freq, is_whole=True)

    def update_whole_oa_info(self):
        """Update the overall level (O.A.) for the whole spectrogram."""
        
        fft_array = self.matplotlib_widget.fft_array
        if fft_array is None or fft_array.size == 0:
            self.update_oa_info(None, is_whole=True)
            return
        oa_db = MatplotlibWidget.compute_overall_level_db(fft_array)
        self.update_oa_info(oa_db, is_whole=True)

    def update_max_info(self, max_val_db, max_freq, is_whole=False):
        """Update the maximum-value labels."""

        if is_whole:
            self.label_max_value.setText(f"Max Value[dB]: {max_val_db:.2f} (whole)")
            self.label_max_freq.setText(f"Max Freq[Hz]: {max_freq:.2f} (whole)")
        else:
            self.label_max_value.setText(f"Max Value[dB]: {max_val_db:.2f} (sel)")
            self.label_max_freq.setText(f"Max Freq[Hz]: {max_freq:.2f} (sel)")

    def update_oa_info(self, oa_db, is_whole=False):
        """Update the overall level (O.A.) label."""
        
        if oa_db is None:
            if is_whole:
                self.label_oa_value.setText("O.A.[dB]: --- (whole)")
            else:
                self.label_oa_value.setText("O.A.[dB]: --- (sel)")
        else:
            if is_whole:
                self.label_oa_value.setText(f"O.A.[dB]: {oa_db:.2f} (whole)")
            else:
                self.label_oa_value.setText(f"O.A.[dB]: {oa_db:.2f} (sel)")

    def play_audio(self):
        """Play the audio in the selected range."""
        
        if self.current_waveform is None or self.current_sr is None:
            print("No audio loaded.")
            QMessageBox.warning(self, "Warning", "No audio loaded.")
            return

        self.stop_playback()

        if (self.matplotlib_widget.cursor_line_wave_start is None or
                self.matplotlib_widget.cursor_line_wave_stop is None):
            print("Cursor lines are not set.")
            QMessageBox.warning(self, "Warning", "Cursor lines are not set.")
            return

        start_pos = self.matplotlib_widget.cursor_line_wave_start.get_xdata()[0]
        stop_pos = self.matplotlib_widget.cursor_line_wave_stop.get_xdata()[0]
        if stop_pos <= start_pos:
            print("Stop position is before start position. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Stop position is before start position. Cannot play."
            )
            return

        self._start_pos_for_playback = start_pos
        self._stop_pos_for_playback = stop_pos

        start_idx = int(start_pos * self.current_sr)
        stop_idx = int(stop_pos * self.current_sr)
        start_idx = max(0, min(start_idx, len(self.current_waveform)))
        stop_idx = max(0, min(stop_idx, len(self.current_waveform)))
        if stop_idx <= start_idx:
            print("Playback range is zero. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Playback range is zero. Cannot play."
            )
            return

        sub_wave = self.current_waveform[start_idx:stop_idx].astype(np.float32)

        speed_factor = self.speed_box.value()
        try:
            sub_wave_stretched = librosa.effects.time_stretch(sub_wave, rate=speed_factor)
        except Exception as e:
            print(f"[TimeStretch Error] fallback to original wave: {e}")
            QMessageBox.warning(
                self,
                "Time Stretch Error",
                f"Could not time stretch audio:\n{e}\nFalling back to original audio."
            )
            sub_wave_stretched = sub_wave

        self._wave_len_no_silence = len(sub_wave_stretched)
        if self._wave_len_no_silence < 1:
            print("Stretched wave is empty. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Stretched wave is empty. Cannot play."
            )
            return

        # ====== Fixed blocksize (too small a value makes playback crackle) ======
        blocksize = 8192
        # =================================

        silence_factor = 5
        min_silence_duration_sec = 0.5
        min_silence_samples = int(self.current_sr * min_silence_duration_sec)
        silence_len = max(blocksize * silence_factor, min_silence_samples)
        extra_silence = np.zeros(silence_len, dtype=sub_wave_stretched.dtype)

        self._waveform_for_playback = np.concatenate((sub_wave_stretched, extra_silence))
        self._current_index = 0

        def audio_callback(outdata, frames, time_info, status):
            if status:
                print(f"Stream status: {status}")
            if self._current_index >= len(self._waveform_for_playback):
                # Nothing left to play: output silence rather than stale buffer contents
                outdata.fill(0)
                raise sd.CallbackStop

            end_index = min(self._current_index + frames, len(self._waveform_for_playback))
            outblock = self._waveform_for_playback[self._current_index:end_index]

            outdata[:len(outblock), 0] = outblock
            if len(outblock) < frames:
                outdata[len(outblock):] = 0

            self._current_index = end_index

            fraction = self._current_index / float(self._wave_len_no_silence)
            fraction = min(fraction, 1.0)

            progress_time = self._start_pos_for_playback + \
                (self._stop_pos_for_playback - self._start_pos_for_playback) * fraction
            self.position_queue.put(progress_time)

        try:
            self.sd_stream = sd.OutputStream(
                samplerate=self.current_sr,
                channels=1,
                blocksize=blocksize,  # use the fixed block size
                callback=audio_callback
            )
            self.sd_stream.start()
        except Exception as e:
            print(f"Error starting audio stream: {e}")
            QMessageBox.critical(
                self,
                "Playback Error",
                f"Could not start audio stream:\n{e}"
            )

    def poll_queue_and_update_cursor(self):
        """Fetch the latest playback time from the queue and update the cursor."""
        
        latest_time = None
        while not self.position_queue.empty():
            latest_time = self.position_queue.get()

        if latest_time is not None:
            # During playback, move the wave_start cursor to the playback position
            if latest_time > self._stop_pos_for_playback:
                latest_time = self._stop_pos_for_playback
            if self.matplotlib_widget.cursor_line_wave_start:
                self.matplotlib_widget.cursor_line_wave_start.set_xdata([latest_time])
            if self.matplotlib_widget.cursor_line_spec_start:
                self.matplotlib_widget.cursor_line_spec_start.set_xdata([latest_time])

            # Update the FFT slice to follow the playback cursor
            self.matplotlib_widget.update_fft_slice(latest_time)

            self.matplotlib_widget.canvas.draw_idle()

        # Handle the end of playback
        if self.sd_stream and not self.sd_stream.active:
            # After playback completes, return the cursor to the playback start position
            if self.matplotlib_widget.cursor_line_wave_start:
                self.matplotlib_widget.cursor_line_wave_start.set_xdata([self._start_pos_for_playback])
            if self.matplotlib_widget.cursor_line_spec_start:
                self.matplotlib_widget.cursor_line_spec_start.set_xdata([self._start_pos_for_playback])

            # Return the FFT plot to the playback start position as well
            self.matplotlib_widget.update_fft_slice(self._start_pos_for_playback)

            self.matplotlib_widget.canvas.draw_idle()
            self.sd_stream.close()
            self.sd_stream = None

    def stop_playback(self):
        """Stop audio playback."""
        
        if self.sd_stream:
            if self.sd_stream.active:
                self.sd_stream.stop()
            self.sd_stream.close()
            self.sd_stream = None
        with self.position_queue.mutex:
            self.position_queue.queue.clear()

    def reset_axes(self):
        """Reset the axis limits to the defaults (Home button)."""
        
        self.matplotlib_widget.reset_axes()

    def closeEvent(self, event):
        """Event handler called when the window is closed."""
        
        self.stop_playback()
        super().closeEvent(event)


if __name__ == "__main__":
    app = QApplication(sys.argv)
    w = MainWindow()
    w.show()
    sys.exit(app.exec_())
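
As a small aside to the listing above, the per-block computation in `update_spectrogram` (FFT of one 1024-sample block, single-sided amplitude scaling, then conversion to dB) can be tried on its own without the GUI. The sketch below is only an illustration under stated assumptions: `to_db` and `spectrum_column` are hypothetical names, `to_db` assumes a plain `20*log10(x/ref)` with a small floor to avoid `log(0)` (the article's own `db` helper may differ), and `numpy.fft` is used in place of `scipy.fftpack` to keep the example dependency-light.

```python
import numpy as np


def to_db(x, ref, floor=1e-12):
    """Hypothetical dB helper: 20*log10(x/ref), floored to avoid log(0)."""
    return 20.0 * np.log10(np.maximum(x, floor) / ref)


def spectrum_column(frame, fft_size=1024, ref=2e-5):
    """Return (amplitude, amplitude in dB) for one block of audio samples."""
    spectrum = np.fft.fft(frame, n=fft_size)
    # Keep the positive-frequency half and scale so a unit sine reads about 1.0
    amp = np.abs(spectrum[:fft_size // 2]) / (fft_size / 2)
    return amp, to_db(amp, ref)


# Test signal: a unit-amplitude sine placed exactly on FFT bin 32
sr = 44100
fft_size = 1024
k = 32
t = np.arange(fft_size) / sr
frame = np.sin(2 * np.pi * (k * sr / fft_size) * t).astype(np.float32)

amp, amp_db = spectrum_column(frame, fft_size)
peak_bin = int(np.argmax(amp))
peak_freq = peak_bin * sr / fft_size  # bin index converted to Hz
print(peak_bin, peak_freq, float(amp_db[peak_bin]))
```

Because the test sine sits exactly on a bin, there is no spectral leakage and the peak amplitude comes out very close to 1.0. With real microphone input, a window function would normally be applied before the FFT.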


import sys

import queue

from functools import partial

import numpy as np

import librosa

import torchaudio

import sounddevice as sd

import pyaudio

import wave

from scipy import signal, fftpack

from PyQt5.QtWidgets import (

QApplication,

QMainWindow,

QVBoxLayout,

QHBoxLayout,

QPushButton,

QFileDialog,

QWidget,

QDoubleSpinBox,

QLabel,

QDialog,

QFormLayout,

QLineEdit,

QComboBox,

QMessageBox,

)

from PyQt5.QtCore import QTimer, pyqtSignal, Qt, QThread

from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas

from matplotlib.patches import Rectangle

import matplotlib.pyplot as plt

# ====== フォントと目盛の設定 ======

plt.rcParams['font.size'] = 14

plt.rcParams['font.family'] = 'Times New Roman'

plt.rcParams['xtick.direction'] = 'in'

plt.rcParams['ytick.direction'] = 'in'

# ====== 設定終了 ======

class AxisSettingsDialog(QDialog):

"""ダイアログウィンドウで軸の設定を行うクラス"""

def __init__(self, parent, current_limits, title, additional_fields=None):

super().__init__(parent)

self.setWindowTitle(title)

self.setModal(True)

self.layout = QFormLayout(self)

self.fields = {}

self.additional_fields = additional_fields if additional_fields else []

# Standard fields

for key in ['Xmin', 'Xmax', 'Ymin', 'Ymax']:

line_edit = QLineEdit(str(current_limits.get(key, 0)))

self.fields[key] = line_edit

self.layout.addRow(f"{key}:", line_edit)

# Additional fields (e.g., Zmin, Zmax)

for key in self.additional_fields:

line_edit = QLineEdit(str(current_limits.get(key, 0)))

self.fields[key] = line_edit

self.layout.addRow(f"{key}:", line_edit)

# Dialog buttons

self.button_box = QHBoxLayout()

self.ok_button = QPushButton("OK")

self.ok_button.clicked.connect(self.accept)

self.cancel_button = QPushButton("Cancel")

self.cancel_button.clicked.connect(self.reject)

self.button_box.addWidget(self.ok_button)

self.button_box.addWidget(self.cancel_button)

self.layout.addRow(self.button_box)

def get_values(self):

"""ユーザーが入力した値を取得する"""

values = {}

for key, widget in self.fields.items():

try:

values[key] = float(widget.text())

except ValueError:

QMessageBox.warning(self, "Input Error", f"Invalid input for {key}.")

return None

return values

class MatplotlibWidget(QWidget):

"""Matplotlibを統合したカスタムウィジェットクラス"""

def __init__(self, parent=None, main_window=None, single_plot=False):

"""

Parameters:

parent: Parent widget.

main_window: Reference to the main window.

single_plot: If True, only spectrogram is displayed with adjusted layout.

"""

super().__init__(parent)

self.main_window = main_window

self.is_recording = False # 録音中かどうかのフラグ

self.layout = QVBoxLayout(self)

if single_plot:

# 単一の大きなスペクトログラム表示用レイアウト

self.figure = plt.figure(figsize=(10, 6))

self.ax_spec = self.figure.add_subplot(111)

self.ax_wave = None

self.ax_fft = None

else:

# 通常のレイアウト

self.figure = plt.figure(constrained_layout=True, figsize=(10, 6)) # 幅を広げてFFT波形用スペースを確保

self.gs = self.figure.add_gridspec(

2, 2,

height_ratios=[1, 1.2],

width_ratios=[1, 0.6] # FFT波形用に幅を増やす

)

# 左カラム: Time波形(上) とスペクトログラム(下)

self.ax_wave = self.figure.add_subplot(self.gs[0, 0])

self.ax_spec = self.figure.add_subplot(self.gs[1, 0])

# 右カラム: FFT波形

self.ax_fft = self.figure.add_subplot(self.gs[:, 1])

self.canvas = FigureCanvas(self.figure)

self.layout.addWidget(self.canvas)

# カラーバー

self.cbar = None

# スペクトログラムの内部データ

self.time_axis = None # 波形表示用 (秒)

self.fft_array = None # スペクトログラム用 [周波数×時間]

self.spec_time_axis = None # スペクトログラムの時間軸 (秒)

self.freq_axis = None # スペクトログラムの周波数軸 (Hz)

# スペクトログラムのイメージオブジェクトの保存

self.spec_im = None

# スペクトログラムのカラー範囲設定

self.user_spec_zmin = None

self.user_spec_zmax = None

self.spec_color_manually_set = False # スペクトログラムのカラー範囲が手動で設定されたかのフラグ

# FFT波形の軸設定

self.user_fft_x_limits = None

self.user_fft_y_limits = None

self.fft_axes_manually_set = False # FFT軸が手動で設定されたかのフラグ

# FFT Y軸の固定範囲

self.fft_y_limits = (None, None)

# デフォルト軸設定の保存

self.default_limits = {

'ax_wave': None,

'ax_spec': None,

'ax_fft': None,

'spec_zmin': None,

'spec_zmax': None

}

# イベント登録

self.canvas.mpl_connect("button_press_event", self.on_mouse_press)

self.canvas.mpl_connect("motion_notify_event", self.on_mouse_drag)

self.canvas.mpl_connect("button_release_event", self.on_mouse_release)

# 矩形選択用

self.is_selecting_area = False

self.rect_start = None

self.selection_rect = None # 現在の四角形オブジェクト

self.rect_edge_color = 'yellow'

self.rect_linewidth = 1.5

# 波形カーソルドラッグ中フラグ

self.is_dragging_wave_start = False

self.is_dragging_wave_stop = False

# カーソル線

self.cursor_line_wave_start = None

self.cursor_line_wave_stop = None

self.cursor_line_spec_start = None

self.cursor_line_spec_stop = None

# 録音時用のスペクトログラム更新用インデックス

self.record_spec_current_index = 0

self.record_spec_total_segments = 0

def _hide_patch(self, patch):

"""

パッチをset_visible(False)にして画面から消す。

リストから物理的に削除せずに済むので、ArtistListが再代入できない問題を回避できる。

"""

if patch is not None:

patch.set_visible(False)

# ================= 波形表示 =================

def plot_waveform(self, waveform, sample_rate):

"""波形をプロットする。"""

if self.ax_wave is None:

return # 単一プロットモードでは波形を表示しない

self.time_axis = np.linspace(0, len(waveform) / sample_rate, len(waveform))

self.ax_wave.clear()

self.ax_wave.plot(self.time_axis, waveform, linewidth=1.0)

self.ax_wave.set_xlabel("Time [s]")

self.ax_wave.set_ylabel("Amplitude")

# 目盛を両側に設定

self.ax_wave.yaxis.set_ticks_position('both')

self.ax_wave.xaxis.set_ticks_position('both')

if len(self.time_axis) > 0:

self.ax_wave.set_xlim(0, self.time_axis[-1])

else:

self.ax_wave.set_xlim(0, 1)

# カーソル線(開始)

if self.cursor_line_wave_start is None:

self.cursor_line_wave_start = self.ax_wave.axvline(0.0, color='red', linestyle='--')

else:

self.cursor_line_wave_start.set_xdata([0.0])

# カーソル線(停止)

if self.cursor_line_wave_stop is None:

stop_x = self.time_axis[-1] if len(self.time_axis) else 0.0

self.cursor_line_wave_stop = self.ax_wave.axvline(stop_x, color='blue', linestyle='--')

else:

stop_x = self.time_axis[-1] if len(self.time_axis) else 0.0

self.cursor_line_wave_stop.set_xdata([stop_x])

self.canvas.draw_idle()

# ================= スペクトログラム表示 =================

def plot_spectrogram(self, waveform, sample_rate, overlap=0, max_chart_time=None, is_recording=False):

"""

スペクトログラムをプロットする。

Parameters:

waveform: numpy array of audio data.

sample_rate: Sampling rate of the audio.

overlap: Overlap percentage between segments.

max_chart_time: Maximum time (in seconds) to display on the spectrogram.

If None, display full spectrogram.

is_recording: Boolean flag indicating if it's recording mode.

"""

# FFTサイズ

Fs = 1024

dBref = 2e-5

Ts = len(waveform) / sample_rate

Fc = Fs / sample_rate

x_ol = Fs * (1 - (overlap / 100))

N_ave = int((Ts - (Fc * (overlap / 100))) / (Fc * (1 - (overlap / 100))))

N_ave = max(N_ave, 1) # N_aveが0になるのを防ぐ

segments = []

for i in range(N_ave):

start_point = int(x_ol * i)

end_point = start_point + Fs

if end_point <= len(waveform):

segments.append(waveform[start_point:end_point])

else:

break

han = signal.windows.hann(Fs)

acf = 1 / (sum(han) / Fs)

segments = [seg * han for seg in segments]

fft_array = []

for seg in segments:

fft_seg = acf * np.abs(fftpack.fft(seg)[:Fs // 2] / (Fs / 2))

fft_seg_db = self.db(fft_seg, dBref)

fft_array.append(fft_seg_db)

fft_array = np.array(fft_array).T

if is_recording:

# 録音時用のスペクトログラム初期化

self.record_spec_total_segments = int(max_chart_time / (Fs / sample_rate))

self.fft_array = np.zeros((Fs // 2, self.record_spec_total_segments))

self.spec_time_axis = np.linspace(0, max_chart_time, self.record_spec_total_segments)

self.record_spec_current_index = 0

else:

# 通常モードではデータをそのまま使用

self.fft_array = fft_array

if fft_array.shape[1] > 0:

self.spec_time_axis = np.linspace(

N_ave * (1 - overlap / 100) * Fs / sample_rate,

fft_array.shape[1]

)

else:

self.spec_time_axis = np.array([])

self.freq_axis = np.linspace(0, sample_rate / 2, Fs // 2)

# 既存の colorbar を remove

if self.cbar is not None:

try:

self.cbar.remove()

except Exception as e:

print("[Warning] cbar.remove() failed:", e)

self.cbar = None

self.ax_spec.clear()

if self.fft_array.size > 0:

im = self.ax_spec.imshow(

self.fft_array,

vmin=np.min(self.fft_array),

vmax=np.max(self.fft_array),

extent=[self.spec_time_axis[0], self.spec_time_axis[-1],

self.freq_axis[0], self.freq_axis[-1]],

aspect='auto',

cmap='jet',

origin='lower'

)

self.spec_im = im # スペクトログラムのイメージオブジェクトを保存

if len(self.spec_time_axis) >= 2:

self.spec_im.set_extent([self.spec_time_axis[0], self.spec_time_axis[-1],

self.freq_axis[0], self.freq_axis[-1]])

self.ax_spec.set_xlim(self.spec_time_axis[0], self.spec_time_axis[-1])

else:

# spec_time_axisに十分なデータがない場合、スペクトログラムの範囲をデフォルトに設定

self.spec_im.set_extent([0, 5, self.freq_axis[0], self.freq_axis[-1]])

self.ax_spec.set_xlim(0, 5)

self.ax_spec.set_ylim(self.freq_axis[0], self.freq_axis[-1])

try:

self.cbar = self.figure.colorbar(im, ax=self.ax_spec, orientation='vertical', pad=0.02)

self.cbar.set_label("Amplitude [dB]")

except Exception as e:

print("[Warning] colorbar creation failed:", e)

self.cbar = None

else:

# データが存在しない場合の設定

if max_chart_time:

self.ax_spec.set_xlim(0, max_chart_time)

self.ax_spec.set_ylim(0, 1)

else:

self.ax_spec.set_xlim(0, 1)

self.ax_spec.set_ylim(0, 1)

self.spec_im = None

self.ax_spec.set_xlabel("Time [s]")

self.ax_spec.set_ylabel("Frequency [Hz]")

# 古い selection_rect があれば非表示に

if self.selection_rect is not None:

self._hide_patch(self.selection_rect)

self.selection_rect = None

self.is_selecting_area = False

# スペクトログラムカーソル再作成

self.cursor_line_spec_start = self.ax_spec.axvline(0.0, color='red', linestyle='--')

if max_chart_time:

stop_x = max_chart_time

else:

stop_x = self.spec_time_axis[-1] if len(self.spec_time_axis) > 0 else 0.0

self.cursor_line_spec_stop = self.ax_spec.axvline(stop_x, color='blue', linestyle='--')

# FFT波形のY軸をスペクトログラムのMax dBに固定

if self.fft_array.size > 0:

self.fft_y_limits = (np.min(self.fft_array), np.max(self.fft_array))

# スペクトログラムのカラー範囲が手動で設定されていない場合のみデフォルト設定を適用

if not self.spec_color_manually_set:

if self.ax_fft:

self.ax_fft.set_xlim(0, sample_rate / 2)

if self.fft_y_limits[0] is not None and self.fft_y_limits[1] is not None:

self.ax_fft.set_ylim(self.fft_y_limits)

# スペクトログラムのデフォルトZmin, Zmaxを保存

self.default_limits['spec_zmin'] = np.min(self.fft_array)

self.default_limits['spec_zmax'] = np.max(self.fft_array)

else:

# スペクトログラムのカラー範囲が手動で設定されている場合はFFT軸には影響しない

if self.spec_im is not None and self.user_spec_zmin is not None and self.user_spec_zmax is not None:

self.spec_im.set_clim(self.user_spec_zmin, self.user_spec_zmax)

if self.cbar:

self.cbar.update_normal(self.spec_im)

else:

self.fft_y_limits = (0, 1)

if not self.spec_color_manually_set:

if self.ax_fft:

self.ax_fft.set_xlim(0, sample_rate / 2)

self.ax_fft.set_ylim(0, 1)

self.default_limits['spec_zmin'] = 0

self.default_limits['spec_zmax'] = 1

else:

if self.spec_im is not None and self.user_spec_zmin is not None and self.user_spec_zmax is not None:

self.spec_im.set_clim(self.user_spec_zmin, self.user_spec_zmax)

if self.cbar:

self.cbar.update_normal(self.spec_im)

# FFT波形のグリッド表示を削除

if self.ax_fft:

self.ax_fft.clear()

self.ax_fft.set_title("FFT")

self.ax_fft.set_xlabel("Frequency [Hz]")

self.ax_fft.set_ylabel("Amplitude [dB]")

# 目盛を両側に設定

self.ax_fft.yaxis.set_ticks_position('both')

self.ax_fft.xaxis.set_ticks_position('both')

self.canvas.draw_idle()

@staticmethod

def db(x, dBref):

"""dB変換を行う。"""

return 20 * np.log10(np.maximum(x, dBref) / dBref)

@staticmethod

def compute_overall_level_db(db_matrix):

"""全体のレベルをdBで計算する。"""

if db_matrix.shape[0] <= 1:

return None

db_matrix_no0 = db_matrix[1:, :]

linear_vals = np.power(10.0, db_matrix_no0 / 10.0)

total_power = np.sum(linear_vals)

if total_power <= 0:

return None

return 10.0 * np.log10(total_power)

# ============ 四角形ドラッグ選択 ============

def start_rectangle_selection(self, xdata, ydata):

"""四角形のドラッグ開始。"""

if self.selection_rect is not None:

self._hide_patch(self.selection_rect)

self.selection_rect = None

self.is_selecting_area = True

self.rect_start = (xdata, ydata)

new_rect = Rectangle(

(xdata, ydata), 0, 0,

edgecolor=self.rect_edge_color,

facecolor='none',

linewidth=self.rect_linewidth

)

self.ax_spec.add_patch(new_rect)

self.selection_rect = new_rect

def update_rectangle_selection(self, xdata, ydata):

"""ドラッグ中: 四角形のサイズを更新"""

if not self.is_selecting_area or self.selection_rect is None:

return

x0, y0 = self.rect_start

self.selection_rect.set_x(min(x0, xdata))

self.selection_rect.set_y(min(y0, ydata))

self.selection_rect.set_width(abs(xdata - x0))

self.selection_rect.set_height(abs(ydata - y0))

self.canvas.draw_idle()

def finish_rectangle_selection(self, xdata, ydata):

"""ドラッグ終了: 四角形を画面に残したままfinish"""

if not self.is_selecting_area or self.selection_rect is None:

return

self.is_selecting_area = False

x0, y0 = self.rect_start

x1, y1 = xdata, ydata

if self.spec_time_axis is None or len(self.spec_time_axis) == 0:

return

if self.freq_axis is None or len(self.freq_axis) == 0:

return

tmin, tmax = sorted([x0, x1])

fmin, fmax = sorted([y0, y1])

tmin = max(tmin, self.spec_time_axis[0]) if len(self.spec_time_axis) > 0 else 0

tmax = min(tmax, self.spec_time_axis[-1]) if len(self.spec_time_axis) > 0 else 0

fmin = max(fmin, self.freq_axis[0]) if len(self.freq_axis) > 0 else 0

fmax = min(fmax, self.freq_axis[-1]) if len(self.freq_axis) > 0 else 0

if tmin >= tmax or fmin >= fmax:

return

time_indices = np.where(

(self.spec_time_axis >= tmin) & (self.spec_time_axis <= tmax)

)[0]

freq_indices = np.where(

(self.freq_axis >= fmin) & (self.freq_axis <= fmax)

)[0]

if len(time_indices) == 0 or len(freq_indices) == 0:

return

selected_region = self.fft_array[np.ix_(freq_indices, time_indices)]

max_val = np.max(selected_region)

freq_idx_in_sel, _ = np.unravel_index(

np.argmax(selected_region), selected_region.shape

)

actual_freq_idx = freq_indices[freq_idx_in_sel]

max_freq = self.freq_axis[actual_freq_idx]

freq_indices_no0 = freq_indices[freq_indices > 0]

if len(freq_indices_no0) == 0:

partial_oa_db = None

else:

partial_region_no0 = selected_region[

freq_indices_no0 - freq_indices_no0.min(), :

]

partial_oa_db = self.compute_overall_level_db(partial_region_no0)

# 計算結果を MainWindow に通知

if self.main_window:

self.main_window.update_max_info(max_val, max_freq, is_whole=False)

self.main_window.update_oa_info(partial_oa_db, is_whole=False)

self.canvas.draw_idle()

# ============ マウスイベント ============

def on_mouse_press(self, event):

"""マウスクリックイベントのハンドラー"""

if event.dblclick:

# ダブルクリック時に軸設定ダイアログを開く

self.open_axis_settings_dialog(event)

return

if self.ax_wave and event.inaxes == self.ax_wave and event.button == 1:

xdata = event.xdata

if xdata is None:

return

if self.cursor_line_wave_start is None or self.cursor_line_wave_stop is None:

return

start_pos = self.cursor_line_wave_start.get_xdata()[0]

stop_pos = self.cursor_line_wave_stop.get_xdata()[0]

if abs(xdata - start_pos) < abs(xdata - stop_pos):

self.is_dragging_wave_start = True

else:

self.is_dragging_wave_stop = True

elif self.ax_spec and event.inaxes == self.ax_spec and event.button == 1:

# 新しい四角形ドラッグ開始

if event.xdata is not None and event.ydata is not None:

self.start_rectangle_selection(event.xdata, event.ydata)

def on_mouse_drag(self, event):

"""マウスドラッグイベントのハンドラー"""

if self.is_dragging_wave_start and self.ax_wave and event.inaxes == self.ax_wave:

if event.xdata is not None:

self.cursor_line_wave_start.set_xdata([event.xdata])

self._adjust_stop_cursor_if_needed()

if self.cursor_line_spec_start:

self.cursor_line_spec_start.set_xdata([event.xdata])

# --- FFTスライス更新 ---

self.update_fft_slice(event.xdata)

self.canvas.draw_idle()

elif self.is_dragging_wave_stop and self.ax_wave and event.inaxes == self.ax_wave:

if event.xdata is not None:

self.cursor_line_wave_stop.set_xdata([event.xdata])

self._adjust_stop_cursor_if_needed()

if self.cursor_line_spec_stop:

self.cursor_line_spec_stop.set_xdata([event.xdata])

# stopカーソル移動時は FFTスライス更新しない(赤カーソルのみ更新)

self.canvas.draw_idle()

elif self.is_selecting_area and self.ax_spec and event.inaxes == self.ax_spec:

if event.xdata is not None and event.ydata is not None:

self.update_rectangle_selection(event.xdata, event.ydata)

def on_mouse_release(self, event):

"""マウスリリースイベントのハンドラー"""

if self.is_dragging_wave_start or self.is_dragging_wave_stop:

self.is_dragging_wave_start = False

self.is_dragging_wave_stop = False

return

if self.is_selecting_area and self.ax_spec and event.inaxes == self.ax_spec:

if event.xdata is not None and event.ydata is not None:

self.finish_rectangle_selection(event.xdata, event.ydata)

def _adjust_stop_cursor_if_needed(self):

"""ストップカーソルの位置を調整する"""

if self.ax_wave is None:

return

start_wave = self.cursor_line_wave_start.get_xdata()[0]

stop_wave = self.cursor_line_wave_stop.get_xdata()[0]

if stop_wave < start_wave:

forced_stop = start_wave + 0.001

if self.time_axis is not None and len(self.time_axis) > 0:

if forced_stop > self.time_axis[-1]:

forced_stop = self.time_axis[-1]

self.cursor_line_wave_stop.set_xdata([forced_stop])

# ============ 追加: FFT波形表示 ============

def update_fft_slice(self, time_sec):

"""

与えられた time_sec における FFT振幅スペクトルを ax_fft に描画。

Parameters:

time_sec: 時間（秒）。

"""

if self.fft_array is None or self.fft_array.size == 0:

return

if self.spec_time_axis is None or len(self.spec_time_axis) == 0:

return

if self.freq_axis is None or len(self.freq_axis) == 0:

return

# time_sec が spec_time_axis の範囲外の場合はクリップ

if time_sec < self.spec_time_axis[0]:

time_sec = self.spec_time_axis[0]

if time_sec > self.spec_time_axis[-1]:

time_sec = self.spec_time_axis[-1]

# 指定した time_sec に一番近いスペクトログラム上の列インデックスを探す

idx = np.searchsorted(self.spec_time_axis, time_sec)

idx = max(0, min(idx, self.fft_array.shape[1] - 1))

# 周波数軸 vs振幅[dB] ( = self.fft_array[:, idx] ) を取得

slice_db = self.fft_array[:, idx]

if self.ax_fft is None:

return # 単一プロットモードではFFTを表示しない

# ax_fft 再描画

self.ax_fft.clear()

self.ax_fft.plot(self.freq_axis, slice_db, color='magenta', linewidth=1.0)

self.ax_fft.set_title(f"FFT @ t={time_sec:.3f}s")

self.ax_fft.set_xlabel("Frequency [Hz]")

self.ax_fft.set_ylabel("Amplitude [dB]")

# 目盛を両側に設定

self.ax_fft.yaxis.set_ticks_position('both')

self.ax_fft.xaxis.set_ticks_position('both')

# FFTの軸を固定スケールに設定

if not self.fft_axes_manually_set:

if self.main_window and self.main_window.current_sr:

self.ax_fft.set_xlim(0, self.main_window.current_sr / 2)

else:

self.ax_fft.set_xlim(0, self.freq_axis[-1] if len(self.freq_axis) > 0 else 1)

if self.fft_y_limits[0] is not None and self.fft_y_limits[1] is not None:

self.ax_fft.set_ylim(self.fft_y_limits)

# ユーザーが手動設定した場合は、設定された軸を維持

else:

if self.user_fft_x_limits and self.user_fft_y_limits:

self.ax_fft.set_xlim(self.user_fft_x_limits)

self.ax_fft.set_ylim(self.user_fft_y_limits)

self.canvas.draw_idle()

# ============ 軸設定ダイアログの表示 ============

def open_axis_settings_dialog(self, event):

"""

ダブルクリックされた軸に基づいて軸設定ダイアログを開く。

Parameters:

event: マウスイベントオブジェクト。

"""

if self.ax_wave and event.inaxes == self.ax_wave:

current_limits = {

'Xmin': self.ax_wave.get_xlim()[0],

'Xmax': self.ax_wave.get_xlim()[1],

'Ymin': self.ax_wave.get_ylim()[0],

'Ymax': self.ax_wave.get_ylim()[1],

}

title = "Set Time Waveform Axis Limits"

dialog = AxisSettingsDialog(self, current_limits, title)

if dialog.exec_() == QDialog.Accepted:

new_limits = dialog.get_values()

if new_limits:

self.ax_wave.set_xlim(new_limits['Xmin'], new_limits['Xmax'])

self.ax_wave.set_ylim(new_limits['Ymin'], new_limits['Ymax'])

self.canvas.draw_idle()

# 更新後のFFTも再描画

current_time = self.cursor_line_wave_start.get_xdata()[0]

self.update_fft_slice(current_time)

elif self.ax_spec and event.inaxes == self.ax_spec:

if self.spec_im is not None:

current_limits = {

'Xmin': self.ax_spec.get_xlim()[0],

'Xmax': self.ax_spec.get_xlim()[1],

'Ymin': self.ax_spec.get_ylim()[0],

'Ymax': self.ax_spec.get_ylim()[1],

'Zmin': self.spec_im.get_clim()[0],

'Zmax': self.spec_im.get_clim()[1],

}

else:

current_limits = {

'Xmin': 0,

'Xmax': 1,

'Ymin': 0,

'Ymax': 1,

'Zmin': 0,

'Zmax': 1,

}

title = "Set Spectrogram Axis Limits"

dialog = AxisSettingsDialog(

self,

current_limits,

title,

additional_fields=['Zmin', 'Zmax']

)

if dialog.exec_() == QDialog.Accepted:

new_limits = dialog.get_values()

if new_limits:

self.ax_spec.set_xlim(new_limits['Xmin'], new_limits['Xmax'])

self.ax_spec.set_ylim(new_limits['Ymin'], new_limits['Ymax'])

# Set colorbar limits

if 'Zmin' in new_limits and 'Zmax' in new_limits:

if self.spec_im is not None:

self.spec_im.set_clim(new_limits['Zmin'], new_limits['Zmax'])

if self.cbar:

self.cbar.update_normal(self.spec_im)

self.canvas.draw_idle()

# Update user settings for spectrogram color range

self.spec_color_manually_set = True

self.user_spec_zmin = new_limits.get('Zmin', None)

self.user_spec_zmax = new_limits.get('Zmax', None)

elif self.ax_fft and event.inaxes == self.ax_fft:

current_limits = {

'Xmin': self.ax_fft.get_xlim()[0],

'Xmax': self.ax_fft.get_xlim()[1],

'Ymin': self.ax_fft.get_ylim()[0],

'Ymax': self.ax_fft.get_ylim()[1],

}

title = "Set FFT Axis Limits"

dialog = AxisSettingsDialog(self, current_limits, title)

if dialog.exec_() == QDialog.Accepted:

new_limits = dialog.get_values()

if new_limits:

self.ax_fft.set_xlim(new_limits['Xmin'], new_limits['Xmax'])

self.ax_fft.set_ylim(new_limits['Ymin'], new_limits['Ymax'])

self.canvas.draw_idle()

# 手動設定フラグを立てて、設定を保存

self.fft_axes_manually_set = True

self.user_fft_x_limits = (new_limits['Xmin'], new_limits['Xmax'])

self.user_fft_y_limits = (new_limits['Ymin'], new_limits['Ymax'])

# ============ 軸設定のデフォルト保存とリセット ============

def set_default_limits(self):

"""現在の軸設定をデフォルトとして保存"""

if self.ax_wave:

self.default_limits['ax_wave'] = self.ax_wave.get_xlim() + self.ax_wave.get_ylim()

if self.ax_spec:

self.default_limits['ax_spec'] = self.ax_spec.get_xlim() + self.ax_spec.get_ylim()

if self.ax_fft:

self.default_limits['ax_fft'] = self.ax_fft.get_xlim() + self.ax_fft.get_ylim()

if self.spec_im is not None:

self.default_limits['spec_zmin'] = self.spec_im.get_clim()[0]

self.default_limits['spec_zmax'] = self.spec_im.get_clim()[1]

else:

self.default_limits['spec_zmin'] = 0

self.default_limits['spec_zmax'] = 1

def reset_axes(self):

"""Homeボタンで軸設定をデフォルトにリセット"""

if self.ax_wave and self.default_limits['ax_wave']:

self.ax_wave.set_xlim(

self.default_limits['ax_wave'][0],

self.default_limits['ax_wave'][1]

)

self.ax_wave.set_ylim(

self.default_limits['ax_wave'][2],

self.default_limits['ax_wave'][3]

)

if self.ax_spec and self.default_limits['ax_spec']:

self.ax_spec.set_xlim(

self.default_limits['ax_spec'][0],

self.default_limits['ax_spec'][1]

)

self.ax_spec.set_ylim(

self.default_limits['ax_spec'][2],

self.default_limits['ax_spec'][3]

)

# スペクトログラムのカラー範囲をデフォルトにリセット

if self.spec_im is not None and 'spec_zmin' in self.default_limits and 'spec_zmax' in self.default_limits:

self.spec_im.set_clim(

self.default_limits['spec_zmin'],

self.default_limits['spec_zmax']

)

if self.cbar:

self.cbar.update_normal(self.spec_im)

if self.ax_fft and self.default_limits['ax_fft']:

self.ax_fft.set_xlim(

self.default_limits['ax_fft'][0],

self.default_limits['ax_fft'][1]

)

self.ax_fft.set_ylim(

self.default_limits['ax_fft'][2],

self.default_limits['ax_fft'][3]

)

# Reset user settings

self.spec_color_manually_set = False

self.user_spec_zmin = None

self.user_spec_zmax = None

self.fft_axes_manually_set = False

self.user_fft_x_limits = None

self.user_fft_y_limits = None

self.canvas.draw_idle()

class RecordThread(QThread):

"""録音をバックグラウンドで行うスレッドクラス"""

data_signal = pyqtSignal(np.ndarray)

finished_signal = pyqtSignal(str)

error_signal = pyqtSignal(str)

def __init__(self, mic_index, samplerate, frames_per_buffer, record_seconds, filename):

super().__init__()

self.mic_index = mic_index

self.samplerate = samplerate

self.frames_per_buffer = frames_per_buffer

self.record_seconds = record_seconds

self.filename = filename

self._is_running = True

def run(self):

"""スレッドの実行部分"""

try:

pa = pyaudio.PyAudio()

stream = pa.open(

format=pyaudio.paInt16,

channels=1,

rate=self.samplerate,

input=True,

frames_per_buffer=self.frames_per_buffer,

input_device_index=self.mic_index

)

frames = []

total_frames = int(self.samplerate / self.frames_per_buffer * self.record_seconds)

for _ in range(total_frames):

if not self._is_running:

break

data = stream.read(self.frames_per_buffer, exception_on_overflow=False)

frames.append(data)

# Convert bytes to numpy array

audio_data = np.frombuffer(data, dtype=np.int16).astype(np.float32) / 32768.0

self.data_signal.emit(audio_data)

stream.stop_stream()

stream.close()

pa.terminate()

# Save WAV file

wf = wave.open(self.filename, 'wb')

wf.setnchannels(1)

wf.setsampwidth(pa.get_sample_size(pyaudio.paInt16))

wf.setframerate(self.samplerate)

wf.writeframes(b''.join(frames))

wf.close()

self.finished_signal.emit(self.filename)

except Exception as e:

self.error_signal.emit(str(e))

def stop(self):

"""録音を停止する"""

self._is_running = False

class RecordWindow(QDialog):

"""録音用のダイアログウィンドウクラス"""

recording_finished = pyqtSignal(str)

def __init__(self, parent=None):

super().__init__(parent)

self.setWindowTitle("Record WAV")

self.setModal(True)

self.layout = QVBoxLayout(self)

self.form_layout = QFormLayout()

self.layout.addLayout(self.form_layout)

# 録音時間設定

self.record_time_spin = QDoubleSpinBox(self)

self.record_time_spin.setRange(1.0, 600.0) # 最大10分

self.record_time_spin.setValue(5.0)

self.record_time_spin.setSingleStep(1.0)

self.form_layout.addRow("Record Time [s]:", self.record_time_spin)

# マイク選択プルダウン

self.mic_combo = QComboBox(self)

self.populate_microphones()

self.form_layout.addRow("Select Microphone:", self.mic_combo)

# WAVファイル名入力欄

self.filename_layout = QHBoxLayout()

self.filename_edit = QLineEdit(self)

self.filename_edit.setPlaceholderText("recorded")

self.filename_layout.addWidget(self.filename_edit)

self.filename_layout.addWidget(QLabel(".wav"))

self.form_layout.addRow("Filename:", self.filename_layout)

# 録音ボタンとCloseボタン

self.button_layout = QHBoxLayout()

self.record_button = QPushButton("Record", self)

self.record_button.clicked.connect(self.toggle_recording)

self.button_layout.addWidget(self.record_button)

self.close_button = QPushButton("Close", self)

self.close_button.clicked.connect(self.close)

self.button_layout.addWidget(self.close_button)

self.layout.addLayout(self.button_layout)

# スペクトログラム表示

self.spec_widget = MatplotlibWidget(

self, main_window=self.parent(), single_plot=True

) # single_plot=True でスペクトログラムのみ表示

self.layout.addWidget(self.spec_widget)

# 録音関連

self.record_thread = None

def populate_microphones(self):

"""マイクのリストを取得してプルダウンに追加する"""

pa = pyaudio.PyAudio()

self.mic_dict = {} # 名前とインデックスのマッピング

for i in range(pa.get_device_count()):

info = pa.get_device_info_by_index(i)

if info['maxInputChannels'] > 0:

name = info['name']

self.mic_combo.addItem(name)

self.mic_dict[name] = i

pa.terminate()

if self.mic_combo.count() == 0:

self.mic_combo.addItem("No Microphone Found")

self.record_button.setEnabled(False)

def toggle_recording(self):

"""録音ボタンのトグル動作を行う"""

if not hasattr(self, 'is_recording') or not self.is_recording:

# Start recording

self.start_recording()

else:

# Stop recording

self.stop_recording()

def start_recording(self):

"""録音を開始する"""

# 入力値の取得

record_time = self.record_time_spin.value()

selected_mic_name = self.mic_combo.currentText()

mic_index = self.mic_dict.get(selected_mic_name, None)

filename = self.filename_edit.text().strip()

if not filename:

filename = "recorded"

if not filename.lower().endswith(".wav"):

filename += ".wav"

# マイクが選択されているか確認

if mic_index is None:

QMessageBox.warning(self, "Error", "No microphone selected.")

return

# 録音開始

self.is_recording = True

self.record_button.setText("Stop")

self.record_button.setStyleSheet("background-color: red")

self.close_button.setEnabled(False)

self.record_time_spin.setEnabled(False)

self.mic_combo.setEnabled(False)

self.filename_edit.setEnabled(False)

# Initialize spectrogram for recording

self.spec_widget.is_recording = True

self.spec_widget.plot_spectrogram(

np.array([]), 44100, max_chart_time=record_time, is_recording=True

)

# Start recording thread

self.record_thread = RecordThread(

mic_index=mic_index,

samplerate=44100, # サンプリングレートを設定

frames_per_buffer=1024, # fft_size=1024に合わせる

record_seconds=record_time,

filename=filename

)

self.record_thread.data_signal.connect(self.update_spectrogram)

self.record_thread.finished_signal.connect(self.on_recording_finished)

self.record_thread.error_signal.connect(self.on_recording_error)

self.record_thread.start()

def stop_recording(self):

"""録音を停止する"""

if self.record_thread and self.record_thread.isRunning():

self.record_thread.stop()

self.record_thread.wait()

    def update_spectrogram(self, audio_data):
        """Update the real-time spectrogram"""
        # fft_size=1024, overlap=0
        fft_size = 1024
        overlap = 0  # overlap ratio of 0
        max_chart_time = self.record_time_spin.value()  # based on the recording time

        # FFT
        spectrum = fftpack.fft(audio_data, n=fft_size)
        amp = np.abs(spectrum[:fft_size // 2]) / (fft_size / 2)
        amp_db = self.spec_widget.db(amp, 2e-5)  # use the same dB reference everywhere

        # Update the spectrogram only while recording
        if self.spec_widget.is_recording:
            if (self.spec_widget.fft_array is not None and
                    self.spec_widget.record_spec_current_index < self.spec_widget.record_spec_total_segments):
                # Overwrite the current column of the existing spectrogram
                self.spec_widget.fft_array[:, self.spec_widget.record_spec_current_index] = amp_db
                self.spec_widget.record_spec_current_index += 1

                # Compute the minimum and maximum for the colorbar
                current_min = np.min(self.spec_widget.fft_array)
                current_max = np.max(self.spec_widget.fft_array)

                # Compare with the existing colorbar range
                if self.spec_widget.cbar:
                    current_clim = self.spec_widget.spec_im.get_clim()
                    if current_min < current_clim[0] or current_max > current_clim[1]:
                        # Update the colorbar range
                        self.spec_widget.spec_im.set_clim(current_min, current_max)
                        self.spec_widget.cbar.update_normal(self.spec_widget.spec_im)

                # Redraw the spectrogram
                self.spec_widget.spec_im.set_data(self.spec_widget.fft_array)
                self.spec_widget.canvas.draw_idle()

    def on_recording_finished(self, file_path):
        """Signal handler called after recording has finished"""
        print(f"Recording finished. File path: {file_path}")  # for debugging
        if file_path:
            # The message box is shown only by the main window
            self.recording_finished.emit(file_path)
        else:
            QMessageBox.warning(self, "Recording Failed", "Recording was not successful.")

        # Reset the buttons
        self.is_recording = False
        self.record_button.setText("Record")
        self.record_button.setStyleSheet("")
        self.close_button.setEnabled(True)
        self.record_time_spin.setEnabled(True)
        self.mic_combo.setEnabled(True)
        self.filename_edit.setEnabled(True)

        # Close the recording window automatically
        self.close()

    def on_recording_error(self, error_message):
        """Signal handler called when an error occurs during recording"""
        QMessageBox.critical(self, "Recording Error", error_message)
        self.stop_recording()
        self.close()

    def closeEvent(self, event):
        """Event handler called when the window is closed"""
        if hasattr(self, 'is_recording') and self.is_recording:
            reply = QMessageBox.question(
                self,
                'Recording in progress',
                'Recording is still in progress. Do you want to stop and close?',
                QMessageBox.Yes | QMessageBox.No,
                QMessageBox.No
            )
            if reply == QMessageBox.Yes:
                self.stop_recording()
            else:
                event.ignore()
                return
        event.accept()

class MainWindow(QMainWindow):
    """Main window class of the application"""

    def __init__(self):
        super().__init__()
        self.setWindowTitle("Wav Analyzer Tools")

        self.main_widget = QWidget(self)
        self.setCentralWidget(self.main_widget)
        self.main_hbox = QHBoxLayout(self.main_widget)

        # Left side
        self.left_widget = QWidget(self)
        self.left_layout = QVBoxLayout(self.left_widget)
        self.main_hbox.addWidget(self.left_widget, stretch=1)

        self.matplotlib_widget = MatplotlibWidget(self, main_window=self)
        self.left_layout.addWidget(self.matplotlib_widget)

        self.button_layout = QHBoxLayout()
        self.left_layout.addLayout(self.button_layout)

        # Record WAV button
        self.record_button = QPushButton("Record WAV", self)
        self.record_button.clicked.connect(self.open_record_window)
        self.button_layout.addWidget(self.record_button)

        # Load WAV button
        self.load_button = QPushButton("Load WAV", self)
        self.load_button.clicked.connect(lambda: self.load_wav())
        self.button_layout.addWidget(self.load_button)

        # Play button
        self.play_button = QPushButton("Play", self)
        self.play_button.clicked.connect(self.play_audio)
        self.button_layout.addWidget(self.play_button)

        # Stop button
        self.stop_button = QPushButton("Stop", self)
        self.stop_button.clicked.connect(self.stop_playback)  # connect the click event to stop_playback
        self.button_layout.addWidget(self.stop_button)

        # Playback speed control
        self.speed_label = QLabel("Speed Factor", self)
        self.button_layout.addWidget(self.speed_label)
        self.speed_box = QDoubleSpinBox(self)
        self.speed_box.setRange(0.1, 10.0)  # limit the playback speed to 10x
        self.speed_box.setValue(1.0)
        self.speed_box.setSingleStep(0.1)
        self.button_layout.addWidget(self.speed_box)

        # Home button
        self.home_button = QPushButton("Home", self)
        self.home_button.clicked.connect(self.reset_axes)
        self.left_layout.addWidget(self.home_button)

        # Right side (max value and O.A. display)
        self.info_widget = QWidget(self)
        self.info_layout = QVBoxLayout(self.info_widget)
        self.main_hbox.addWidget(self.info_widget, stretch=0)

        self.label_title = QLabel("<b>Spectrogram Info</b>", self)
        self.info_layout.addWidget(self.label_title)
        self.label_max_value = QLabel("Max Value[dB]: ---", self)
        self.info_layout.addWidget(self.label_max_value)
        self.label_max_freq = QLabel("Max Freq[Hz]: ---", self)
        self.info_layout.addWidget(self.label_max_freq)
        self.label_oa_value = QLabel("O.A.[dB]: ---", self)
        self.info_layout.addWidget(self.label_oa_value)
        self.info_layout.addStretch(1)

        # Playback state
        self.current_waveform = None
        self.current_sr = None
        self._waveform_for_playback = None
        self._current_index = 0
        self._wave_len_no_silence = 1
        self._start_pos_for_playback = 0.0
        self._stop_pos_for_playback = 0.0
        self.sd_stream = None

        # Queue and timer used to move the cursor during playback
        self.position_queue = queue.Queue()
        self.update_timer = QTimer()
        self.update_timer.setInterval(20)
        self.update_timer.timeout.connect(self.poll_queue_and_update_cursor)
        self.update_timer.start()

    def open_record_window(self):
        """Open the recording window when the Record WAV button is clicked"""
        self.record_window = RecordWindow(self)
        self.record_window.recording_finished.connect(self.on_recording_finished)
        self.record_window.show()

    def on_recording_finished(self, file_path):
        """Signal handler called after recording has finished"""
        print(f"Recording finished. File path: {file_path}")  # for debugging
        if file_path:
            # The message box is shown only by the main window
            QMessageBox.information(
                self,
                "Recording Finished",
                f"Recording saved to {file_path}"
            )
            self.load_wav(file_path)
        else:
            QMessageBox.warning(self, "Recording Failed", "Recording was not successful.")

    def load_wav(self, file_path=None):
        """Load a WAV file and display it"""
        print("load_wav called")  # for debugging
        self.stop_playback()

        if file_path is None:
            file_path, _ = QFileDialog.getOpenFileName(
                self, "Select WAV File", "", "WAV Files (*.wav)"
            )
            if not file_path:
                return

        if not isinstance(file_path, str):
            print(f"Invalid file path: {file_path}")
            return

        try:
            waveform, sr = torchaudio.load(file_path)
        except Exception as e:
            print(f"Error loading WAV file: {e}")
            QMessageBox.critical(
                self,
                "Error",
                f"Could not load WAV file:\n{e}"
            )
            return

        # Downmix multi-channel audio to mono
        if waveform.shape[0] > 1:
            waveform = waveform.mean(dim=0, keepdim=True)
        self.current_waveform = waveform.numpy()[0].copy()
        self.current_sr = sr

        # Reset the cursors
        self.matplotlib_widget.cursor_line_wave_start = None
        self.matplotlib_widget.cursor_line_wave_stop = None
        self.matplotlib_widget.cursor_line_spec_start = None
        self.matplotlib_widget.cursor_line_spec_stop = None

        # Update the waveform and the spectrogram
        try:
            self.matplotlib_widget.plot_waveform(self.current_waveform, self.current_sr)
            self.matplotlib_widget.plot_spectrogram(
                self.current_waveform,
                self.current_sr,
                max_chart_time=None,
                is_recording=False
            )  # show the whole spectrogram
            self.update_whole_spectrogram_info()
            self.update_whole_oa_info()
        except Exception as e:
            print(f"Error in plotting: {e}")
            QMessageBox.critical(
                self,
                "Error",
                f"Error in plotting:\n{e}"
            )
            return

        # Move the cursors to 0 [s] and update the FFT slice
        if self.matplotlib_widget.cursor_line_wave_start:
            self.matplotlib_widget.cursor_line_wave_start.set_xdata([0.0])
        if self.matplotlib_widget.cursor_line_spec_start:
            self.matplotlib_widget.cursor_line_spec_start.set_xdata([0.0])
        self.matplotlib_widget.update_fft_slice(0.0)
        self.matplotlib_widget.canvas.draw_idle()

        # Save the default axis settings
        self.matplotlib_widget.set_default_limits()

        # Reset the manual spectrogram color-range settings
        self.matplotlib_widget.spec_color_manually_set = False
        self.matplotlib_widget.user_spec_zmin = None
        self.matplotlib_widget.user_spec_zmax = None

        # Reset the manual FFT axis settings
        self.matplotlib_widget.fft_axes_manually_set = False
        self.matplotlib_widget.user_fft_x_limits = None
        self.matplotlib_widget.user_fft_y_limits = None

    def update_whole_spectrogram_info(self):
        """Update the information computed from the whole spectrogram"""
        fft_array = self.matplotlib_widget.fft_array
        freq_axis = self.matplotlib_widget.freq_axis
        if fft_array is None or fft_array.size == 0:
            self.label_max_value.setText("Max Value[dB]: ---")
            self.label_max_freq.setText("Max Freq[Hz]: ---")
            return

        max_val = np.max(fft_array)
        freq_idx, _ = np.unravel_index(np.argmax(fft_array), fft_array.shape)
        if freq_idx < len(freq_axis):
            max_freq = freq_axis[freq_idx]
        else:
            max_freq = 0.0
        self.update_max_info(max_val, max_freq, is_whole=True)

    def update_whole_oa_info(self):
        """Update the O.A. information computed from the whole spectrogram"""
        fft_array = self.matplotlib_widget.fft_array
        if fft_array is None or fft_array.size == 0:
            self.update_oa_info(None, is_whole=True)
            return
        oa_db = MatplotlibWidget.compute_overall_level_db(fft_array)
        self.update_oa_info(oa_db, is_whole=True)

    def update_max_info(self, max_val_db, max_freq, is_whole=False):
        """Update the maximum value labels"""
        if is_whole:
            self.label_max_value.setText(f"Max Value[dB]: {max_val_db:.2f} (whole)")
            self.label_max_freq.setText(f"Max Freq[Hz]: {max_freq:.2f} (whole)")
        else:
            self.label_max_value.setText(f"Max Value[dB]: {max_val_db:.2f} (sel)")
            self.label_max_freq.setText(f"Max Freq[Hz]: {max_freq:.2f} (sel)")

    def update_oa_info(self, oa_db, is_whole=False):
        """Update the O.A. label"""
        if oa_db is None:
            if is_whole:
                self.label_oa_value.setText("O.A.[dB]: --- (whole)")
            else:
                self.label_oa_value.setText("O.A.[dB]: --- (sel)")
        else:
            if is_whole:
                self.label_oa_value.setText(f"O.A.[dB]: {oa_db:.2f} (whole)")
            else:
                self.label_oa_value.setText(f"O.A.[dB]: {oa_db:.2f} (sel)")

    def play_audio(self):
        """Play the audio in the selected range"""
        if self.current_waveform is None or self.current_sr is None:
            print("No audio loaded.")
            QMessageBox.warning(self, "Warning", "No audio loaded.")
            return

        self.stop_playback()

        if (self.matplotlib_widget.cursor_line_wave_start is None or
                self.matplotlib_widget.cursor_line_wave_stop is None):
            print("Cursor lines are not set.")
            QMessageBox.warning(self, "Warning", "Cursor lines are not set.")
            return

        start_pos = self.matplotlib_widget.cursor_line_wave_start.get_xdata()[0]
        stop_pos = self.matplotlib_widget.cursor_line_wave_stop.get_xdata()[0]
        if stop_pos <= start_pos:
            print("Stop position is before start position. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Stop position is before start position. Cannot play."
            )
            return

        self._start_pos_for_playback = start_pos
        self._stop_pos_for_playback = stop_pos

        # Convert cursor times [s] to sample indices and clamp them to the waveform
        start_idx = int(start_pos * self.current_sr)
        stop_idx = int(stop_pos * self.current_sr)
        start_idx = max(0, min(start_idx, len(self.current_waveform)))
        stop_idx = max(0, min(stop_idx, len(self.current_waveform)))
        if stop_idx <= start_idx:
            print("Playback range is zero. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Playback range is zero. Cannot play."
            )
            return

        sub_wave = self.current_waveform[start_idx:stop_idx].astype(np.float32)
        speed_factor = self.speed_box.value()
        try:
            sub_wave_stretched = librosa.effects.time_stretch(sub_wave, rate=speed_factor)
        except Exception as e:
            print(f"[TimeStretch Error] fallback to original wave: {e}")
            QMessageBox.warning(
                self,
                "Time Stretch Error",
                f"Could not time stretch audio:\n{e}\nFalling back to original audio."
            )
            sub_wave_stretched = sub_wave

        self._wave_len_no_silence = len(sub_wave_stretched)
        if self._wave_len_no_silence < 1:
            print("Stretched wave is empty. Cannot play.")
            QMessageBox.warning(
                self,
                "Warning",
                "Stretched wave is empty. Cannot play."
            )
            return

        # ====== Fixed blocksize (if this is too small, playback becomes choppy) ======
        blocksize = 8192
        # =================================

        # Append trailing silence after the stretched audio
        silence_factor = 5
        min_silence_duration_sec = 0.5
        min_silence_samples = int(self.current_sr * min_silence_duration_sec)
        silence_len = max(blocksize * silence_factor, min_silence_samples)
        extra_silence = np.zeros(silence_len, dtype=sub_wave_stretched.dtype)
        self._waveform_for_playback = np.concatenate((sub_wave_stretched, extra_silence))
        self._current_index = 0

        def audio_callback(outdata, frames, time_info, status):
            if status:
                print(f"Stream status: {status}")
            if self._current_index >= len(self._waveform_for_playback):
                raise sd.CallbackStop

            end_index = min(self._current_index + frames, len(self._waveform_for_playback))
            outblock = self._waveform_for_playback[self._current_index:end_index]
            outdata[:len(outblock), 0] = outblock
            if len(outblock) < frames:
                outdata[len(outblock):] = 0
            self._current_index = end_index

            # Report the playback position on the original time axis
            fraction = self._current_index / float(self._wave_len_no_silence)
            fraction = min(fraction, 1.0)
            progress_time = self._start_pos_for_playback + \
                (self._stop_pos_for_playback - self._start_pos_for_playback) * fraction
            self.position_queue.put(progress_time)

        try:
            self.sd_stream = sd.OutputStream(
                samplerate=self.current_sr,
                channels=1,
                blocksize=blocksize,  # use the fixed block size
                callback=audio_callback
            )
            self.sd_stream.start()
        except Exception as e:
            print(f"Error starting audio stream: {e}")
            QMessageBox.critical(
                self,
                "Playback Error",
                f"Could not start audio stream:\n{e}"
            )

    def poll_queue_and_update_cursor(self):
        """Fetch the latest playback time from the queue and update the cursor"""
        latest_time = None
        while not self.position_queue.empty():
            latest_time = self.position_queue.get()

        if latest_time is not None:
            # During playback, the wave_start cursor is overwritten with the current position
            if latest_time > self._stop_pos_for_playback:
                latest_time = self._stop_pos_for_playback
            if self.matplotlib_widget.cursor_line_wave_start:
                self.matplotlib_widget.cursor_line_wave_start.set_xdata([latest_time])
            if self.matplotlib_widget.cursor_line_spec_start:
                self.matplotlib_widget.cursor_line_spec_start.set_xdata([latest_time])
            # Update the FFT slice to follow the cursor during playback
            self.matplotlib_widget.update_fft_slice(latest_time)
            self.matplotlib_widget.canvas.draw_idle()

        # Handle the end of playback
        if self.sd_stream and not self.sd_stream.active:
            # After playback completes, move the cursor back to the playback start position
            if self.matplotlib_widget.cursor_line_wave_start:
                self.matplotlib_widget.cursor_line_wave_start.set_xdata([self._start_pos_for_playback])
            if self.matplotlib_widget.cursor_line_spec_start:
                self.matplotlib_widget.cursor_line_spec_start.set_xdata([self._start_pos_for_playback])
            # Move the FFT waveform back to the playback start position as well
            self.matplotlib_widget.update_fft_slice(self._start_pos_for_playback)
            self.matplotlib_widget.canvas.draw_idle()
            self.sd_stream.close()
            self.sd_stream = None

    def stop_playback(self):
        """Stop audio playback"""
        if self.sd_stream:
            if self.sd_stream.active:
                self.sd_stream.stop()
            self.sd_stream.close()
            self.sd_stream = None
        with self.position_queue.mutex:
            self.position_queue.queue.clear()

    def reset_axes(self):
        """Reset the axis settings to their defaults (Home button)"""
        self.matplotlib_widget.reset_axes()

    def closeEvent(self, event):
        """Event handler called when the window is closed"""
        self.stop_playback()
        super().closeEvent(event)

if __name__ == "__main__":
    app = QApplication(sys.argv)
    w = MainWindow()
    w.show()
    sys.exit(app.exec_())

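Real-time recording

The first feature I'll introduce is real-time recording. The image below shows the screen right after the program starts; the buttons are grouped below the three graphs. I'll go through each of them in turn, but to record, click the Record WAV button.

The Record WAV window appears.

Set the following items in order.

Record Time [s]
The recording duration. Recording stops automatically once the specified number of seconds has elapsed.
Select Microphone
The microphone used for recording. When the program starts, the available microphones are registered in the dropdown list automatically. If you want to use a USB or Bluetooth microphone, select it from the dropdown.
Filename
The name of the wav file in which the recording is saved. The default entry is recorded, so if you leave it unchanged, recorded.wav is created in the folder where the program is run. Note that an existing wav file with the same name will be overwritten. You do not need to type the extension.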
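The Filename behaviour described above can be sketched roughly as follows. This is only an illustration and not the code used in the app: the helper name `resolve_output_path` and its `directory` parameter are hypothetical, and I am assuming the app simply appends `.wav` to whatever was typed.

```python
from pathlib import Path


def resolve_output_path(name: str, directory: str = ".") -> tuple[Path, bool]:
    """Return the wav path for a typed name and whether a file already exists there.

    Hypothetical helper: an empty name falls back to "recorded", and a
    ".wav" suffix typed by the user is not duplicated.
    """
    stem = name.strip() or "recorded"
    if stem.lower().endswith(".wav"):
        stem = stem[:-4]
    path = Path(directory) / f"{stem}.wav"
    return path, path.exists()


if __name__ == "__main__":
    path, will_overwrite = resolve_output_path("")
    print(path, "(will overwrite)" if will_overwrite else "(new file)")
```

Checking `path.exists()` before starting the recording would make it possible to warn about the overwrite mentioned above; the app as published does not do this.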

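Once the settings are done, click the blue Record button to start recording. While recording, the button label changes to Stop and the button turns red; clicking this Stop button stops the recording even before the specified time has elapsed.

Here is a demo video of the recording feature. You can watch the spectrogram update in real time while recording. When you close the window that appears after recording with the OK button, the data is passed to the main graphs (the screen itself is explained after the wav loading section).

The method for displaying the spectrogram in real time is covered in the following article, so have a look if you are interested.

Reference) Code for real-time spectrogram updates

・Displaying audio recorded with Python as a real-time spectrogram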
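As a rough sketch of the idea behind `update_spectrogram` in the full code: each 1024-sample block from the recording thread is turned into one dB spectrum and written into the next column of a preallocated array. The names `block_to_db` and `SpectrogramBuffer` are hypothetical, the dB formula (20·log10 of amplitude over the reference) and the zero-filled initial array are my assumptions because `db()` and the array setup are not shown in this part of the listing, and `np.fft.fft` is used here in place of `scipy.fftpack.fft`.

```python
import numpy as np

FFT_SIZE = 1024      # same value as frames_per_buffer in the full code
SAMPLERATE = 44100   # sampling rate used for recording in the full code
DB_REF = 2e-5        # reference value passed to db() in the full code


def block_to_db(block: np.ndarray, ref: float = DB_REF) -> np.ndarray:
    """Convert one audio block into an amplitude spectrum in dB.

    Assumption: dB = 20 * log10(amplitude / ref); a small floor avoids log(0).
    """
    spectrum = np.fft.fft(block, n=FFT_SIZE)
    amp = np.abs(spectrum[:FFT_SIZE // 2]) / (FFT_SIZE / 2)
    return 20.0 * np.log10(np.maximum(amp, 1e-12) / ref)


class SpectrogramBuffer:
    """Preallocated (frequency x time) array that is filled one column per block."""

    def __init__(self, record_seconds: float):
        # With overlap 0, one block gives exactly one time segment
        self.total_segments = int(record_seconds * SAMPLERATE / FFT_SIZE)
        self.data = np.zeros((FFT_SIZE // 2, self.total_segments))
        self.index = 0

    def push(self, block: np.ndarray) -> bool:
        """Write the next column; return False once the buffer is full."""
        if self.index >= self.total_segments:
            return False
        self.data[:, self.index] = block_to_db(block)
        self.index += 1
        return True


if __name__ == "__main__":
    buf = SpectrogramBuffer(record_seconds=1.0)
    t = np.arange(FFT_SIZE) / SAMPLERATE
    buf.push(np.sin(2 * np.pi * 1000 * t))  # one block of a 1 kHz tone
    peak_bin = int(np.argmax(buf.data[:, 0]))
    print("peak near", peak_bin * SAMPLERATE / FFT_SIZE, "Hz")
```

Because the array is allocated once for the whole recording time, the GUI only has to hand the updated array to the image object and request a redraw for each block.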

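Loading wav files

Next is the wav file loading feature. Click the Load WAV button on the main screen.

Select an already saved wav file and open it (the dialog will look different depending on your PC).

About the graphs

After recording or loading a wav file, data appears in the three graphs on the main screen. As shown in the image below, they are the time waveform (time [s] × amplitude [Lin]), the spectrogram (time [s] × frequency [Hz] × amplitude [dB]), and the FFT waveform (frequency [Hz] × amplitude [dB]).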

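Audio playback (cursor range selection and playback speed)

You can play back the audio within the range specified by the cursors. There is a red cursor (start position) and a blue cursor (stop position), and clicking the Play button plays the audio in that range. Clicking the Stop button during playback stops it partway. Clicking Play again resumes playback from where it was stopped.

The red cursor moves along with playback, and the FFT waveform is updated in real time at the cursor position (it is also updated while you drag the cursor with the mouse).

You can also enter a playback speed from 0.1 to 10 in the Speed Factor field. For example, 0.5 plays at half speed and 2.0 plays at double speed.

This video is a demo of audio playback (the sound is wat whistling). Watching the video is probably quicker than reading the explanation. Note that I have not yet hunted for the small bugs that may appear when you do various things during playback, so if something goes wrong, quit the app or wait until processing finishes (this is laziness on my part; please bear with me).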
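The arithmetic behind the cursor range and the Speed Factor in `play_audio` can be summarised in a few small functions. The function names here are hypothetical and the sketch leaves out the actual audio output. It relies on the fact that `librosa.effects.time_stretch` with `rate=r` produces audio of roughly 1/r the original length, so the playback index has to be mapped back onto the original time axis to move the red cursor.

```python
def cursor_range_to_indices(start_s, stop_s, samplerate, n_samples):
    """Convert cursor times [s] to sample indices clamped to the waveform."""
    start_idx = max(0, min(int(start_s * samplerate), n_samples))
    stop_idx = max(0, min(int(stop_s * samplerate), n_samples))
    return start_idx, stop_idx


def expected_duration(start_s, stop_s, speed_factor):
    """Approximate playback duration [s] after stretching by speed_factor."""
    return (stop_s - start_s) / speed_factor


def playback_position(current_index, stretched_len, start_s, stop_s):
    """Map an index in the stretched audio back to a time on the original axis."""
    fraction = min(current_index / float(stretched_len), 1.0)
    return start_s + (stop_s - start_s) * fraction


if __name__ == "__main__":
    sr = 44100
    print(cursor_range_to_indices(1.0, 3.0, sr, 4 * sr))  # cursors at 1 s and 3 s
    print(expected_duration(1.0, 3.0, 2.0))               # double speed
    print(playback_position(22050, 44100, 1.0, 3.0))      # halfway through
```

Clamping `fraction` to 1.0 matters because the app appends trailing silence after the stretched audio, so the playback index can run past the end of the selected range.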

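Changing graph axis settings

The axis settings of all three graphs (time waveform, FFT waveform, and spectrogram) can be changed. Double-clicking anywhere inside a graph opens the axis settings window shown next. Enter the values you want for the X, Y, and Z axes, then click OK to apply them.

You can restore the default axis settings at any time by clicking the Home button.

Here is a demo video of changing the axis settings.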

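Getting spectrogram information (rectangle drag calculation)

The last topic is how to get information out of the spectrogram. A spectrogram carries a lot of information (time, frequency, and amplitude), but I think the values needed most often in audio analysis are the maximum value and the O.A. (overall level). To make these easy to obtain intuitively, I implemented a calculation driven by a drag operation.

Right after recording or loading a wav file, the maximum amplitude, the frequency at that maximum, and the O.A. of the whole spectrogram are shown at the top right of the screen. The label (whole) indicates that these values were computed from the entire spectrogram.

Dragging the mouse inside the spectrogram draws a yellow rectangle. This rectangle is the calculation range, and when you release the mouse button the information at the top right changes to that of the selected range. To distinguish it from the whole-spectrogram calculation, the label (sel) is shown.

Here is a demo video of range selection by dragging.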
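The rectangle calculation can be sketched as follows. The maximum search mirrors `update_whole_spectrogram_info` in the full code, applied to a sub-array. The O.A. part is an assumption on my side: `compute_overall_level_db` is not shown in this part of the listing, so this sketch uses one common definition (the energy sum over frequency for each time frame, averaged over the frames) that may differ from what the app computes. The function names and arguments are hypothetical.

```python
import numpy as np


def overall_level_db(levels_db: np.ndarray) -> float:
    """Overall level of a (frequency x time) dB array.

    Assumed definition: sum the power over frequency for each time frame,
    average over the frames, and convert back to dB.
    """
    power = 10.0 ** (np.asarray(levels_db, dtype=float) / 10.0)
    return float(10.0 * np.log10(power.sum(axis=0).mean()))


def selection_stats(fft_array, freq_axis, time_axis, t_range, f_range):
    """Return (max dB, frequency at max, overall level) inside a rectangle.

    fft_array has shape (len(freq_axis), len(time_axis)) with values in dB.
    Returns None when no cell falls inside the rectangle.
    """
    t0, t1 = sorted(t_range)  # dragging right-to-left still works
    f0, f1 = sorted(f_range)
    t_mask = (time_axis >= t0) & (time_axis <= t1)
    f_mask = (freq_axis >= f0) & (freq_axis <= f1)
    if not t_mask.any() or not f_mask.any():
        return None
    sub = fft_array[np.ix_(f_mask, t_mask)]
    freq_idx, _ = np.unravel_index(np.argmax(sub), sub.shape)
    max_freq = float(freq_axis[f_mask][freq_idx])
    return float(sub.max()), max_freq, overall_level_db(sub)


if __name__ == "__main__":
    freq_axis = np.array([0.0, 100.0, 200.0, 300.0])
    time_axis = np.arange(5) * 0.1
    fft_array = np.full((4, 5), 60.0)
    fft_array[2, 3] = 80.0  # one loud cell at 200 Hz, 0.3 s
    print(selection_stats(fft_array, freq_axis, time_axis, (0.25, 0.45), (150.0, 350.0)))
```

Sorting the two corners of the rectangle first means the result does not depend on the direction in which the rectangle was dragged.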

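Summary

This time I used the New Year holidays to build a wav audio analysis tool with a recording feature. It was my first GUI app with PyQt5, but with ChatGPT's help the work moved along quickly. That said, it is a hobby project put together at high speed, so there are probably various problems such as small bugs or calculations that do not match the theoretical values. If you use this code, please verify at least those points yourself and at your own responsibility (I doubt anyone would, but be careful not to use it at work and hand wrong data to a customer).

Next, I plan to check that it runs on Windows and Linux, verify the calculations myself, and put the final version up on GitHub. Thank you for your support again this year.

I managed to build a GUI app that can be operated intuitively!
I also post related information on X, so please follow wat (@watlablog)!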
