You usually don't need a script like the following, but I have just finished writing it. It makes sense for old video recordings whose volume varies considerably.
The audio is normalized second by second instead of all at once. The script caps the amplification at 35 dB so that silent passages are not turned into loud noise.
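The capping rule on its own can be sketched as a small helper. This is only an illustration: the function name capped_gain is hypothetical, and it uses awk for the floating-point comparison, whereas the full script relies on bc.

```shell
#!/bin/bash
# Hypothetical helper (not part of the full script): turn a max_volume
# reading in dB into the gain to apply, capped at a limit (default 35 dB).
capped_gain() {
    local max_volume=$1 limit=${2:-35.0}
    awk -v m="$max_volume" -v l="$limit" 'BEGIN {
        g = (m < 0) ? -m : m   # drop the sign, as the full script does
        if (g == 0) g = 0      # avoid printing "-0.0"
        if (g > l) g = l       # cap the amplification
        printf "%.1f\n", g
    }'
}

capped_gain -12.5   # a moderately quiet second: prints 12.5
capped_gain -91.0   # a near-silent second: prints 35.0
```

The full script below applies the same rule inline, once per one-second segment.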
For this script to work, you must have "ffmpeg", "sox" and "bc" installed ("bc" is used for the floating-point comparison).
Happy hacking!
#!/bin/bash
input=input.mkv
output=output.mp4
audio=audio.wav
newaudio=combined.wav
ffmpeg -i "$input" -vn -ar 44100 -ac 2 "$audio"
ffmpeg -i "$audio" -f segment -segment_time 1 -c copy out%06d.wav
for f in out*.wav
do
    # detect the peak volume of this segment in decibels (e.g. "-12.3")
    MAX=$(ffmpeg -hide_banner -i "$f" -map 0:a -filter:a volumedetect -f null /dev/null 2>&1 | grep 'max_volume' | awk '{print $5}')
    # if no peak was reported, leave the segment unamplified
    if [[ -z $MAX ]]; then MAX="0.0"; fi
    # remove the minus sign (only if $MAX starts with "-"), so that $MAX becomes the gain to apply
    if [[ $MAX == -* ]]; then MAX="${MAX:1}"; fi
    # cap the amplification at 35 dB
    if (( $(echo "$MAX > 35.0" | bc -l) )); then MAX="35.0"; fi
    echo "$f -> $MAX"
    ffmpeg -i "$f" -af "volume=${MAX}dB" "max$f"
done
# Before merging the audio files with sox, we need to raise the limit on the number of open files
ulimit -n 16384 # https://www.spinics.net/lists/sox-users/msg00167.html
sox maxout*.wav "$newaudio"
rm -- out*.wav maxout*.wav
rm "$audio"
# now we replace the old audio with the new audio (https://superuser.com/a/1137613)
ffmpeg -i "$input" -i "$newaudio" -c:v copy -map 0:v:0 -map 1:a:0 "$output"
rm "$newaudio"
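
The awk '{print $5}' step in the loop depends on the position of the number within the log line. A slightly more defensive variant, sketched here as an assumption rather than as part of the script, matches on the max_volume label instead. The function name max_volume_from_log is hypothetical, and the sample line only imitates the format volumedetect writes to stderr.

```shell
#!/bin/bash
# Hypothetical helper: read volumedetect log text on stdin and print the
# max_volume number, matching on the label instead of a field position.
max_volume_from_log() {
    sed -n 's/.*max_volume: \(-\{0,1\}[0-9][0-9.]*\) dB.*/\1/p' | head -n 1
}

# Illustrative log line (the address and the value are made up).
sample='[Parsed_volumedetect_0 @ 0xa23120] max_volume: -4.0 dB'
printf '%s\n' "$sample" | max_volume_from_log   # prints -4.0
```

In the loop, the same ffmpeg command could be piped into this function in place of the grep and awk pair; if nothing matches, it prints nothing, so the result should still be checked for emptiness.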