ios – 从音频文件中提取仪表级别

我需要从文件中提取音频表级别,以便在播放音频之前渲染级别.我知道AVAudioPlayer可以在播放音频文件时获取此信息
func averagePower(forChannel channelNumber: Int) -> Float.

但在我的情况下,我希望事先得到一个米级的[Float].

解决方法

更快的解决方案

它需要一个iPhone 6s加:

> 0.538秒处理8MByte mp3播放器,持续时间为4分47秒,采样率为44,100
> 0.170s处理712KByte MP3播放器,持续时间为22s,100
> 0.089s来处理通过在终端中使用此命令afconvert -f caff -d LEI16 audio.mp3 audio.caf转换上面的文件而创建的caffile.

让我们开始:

A)声明此类将保存有关音频资产的必要信息:

/// Holds audio information used for building waveforms
final class AudioContext {

    /// The audio asset URL used to load the context
    public let audioURL: URL

    /// Total number of samples in loaded asset
    public let totalSamples: Int

    /// Loaded asset
    public let asset: AVAsset

    // Loaded assetTrack
    public let assetTrack: AVAssetTrack

    private init(audioURL: URL,totalSamples: Int,asset: AVAsset,assetTrack: AVAssetTrack) {
        self.audioURL = audioURL
        self.totalSamples = totalSamples
        self.asset = asset
        self.assetTrack = assetTrack
    }

    public static func load(fromAudioURL audioURL: URL,completionHandler: @escaping (_ audioContext: AudioContext?) -> ()) {
        let asset = AVURLAsset(url: audioURL,options: [AVURLAssetPreferPreciseDurationAndTimingKey: NSNumber(value: true as Bool)])

        guard let assetTrack = asset.tracks(withMediaType: AVMediaType.audio).first else {
            fatalError("Couldn't load AVAssetTrack")
        }

        asset.loadValuesAsynchronously(forKeys: ["duration"]) {
            var error: NSError?
            let status = asset.statusOfValue(forKey: "duration",error: &error)
            switch status {
            case .loaded:
                guard
                    let formatDescriptions = assetTrack.formatDescriptions as? [CMAudioFormatDescription],let audioFormatDesc = formatDescriptions.first,let asbd = CMAudioFormatDescriptionGetStreamBasicDescription(audioFormatDesc)
                    else { break }

                let totalSamples = Int((asbd.pointee.mSampleRate) * Float64(asset.duration.value) / Float64(asset.duration.timescale))
                let audioContext = AudioContext(audioURL: audioURL,totalSamples: totalSamples,asset: asset,assetTrack: assetTrack)
                completionHandler(audioContext)
                return

            case .failed,.cancelled,.loading,.unknown:
                print("Couldn't load asset: \(error?.localizedDescription ?? "Unknown error")")
            }

            completionHandler(nil)
        }
    }
}

我们将使用其异步函数load,并将其结果处理为完成处理程序.

B)在视图控制器中导入AVFoundation并加速:

import AVFoundation
import Accelerate

C)在视图控制器中声明噪声级别(以dB为单位):

let noiseFloor: Float = -80

例如,任何小于-80dB的东西都将被视为静音.

D)以下功能采用音频上下文并产生所需的dB功率. targetSamples默认设置为100,您可以更改它以满足您的UI需求:

func render(audioContext: AudioContext?,targetSamples: Int = 100) -> [Float]{
    guard let audioContext = audioContext else {
        fatalError("Couldn't create the audioContext")
    }

    let sampleRange: CountableRange<Int> = 0..<audioContext.totalSamples/3

    guard let reader = try? AVAssetReader(asset: audioContext.asset)
        else {
            fatalError("Couldn't initialize the AVAssetReader")
    }

    reader.timeRange = CMTimeRange(start: CMTime(value: Int64(sampleRange.lowerBound),timescale: audioContext.asset.duration.timescale),duration: CMTime(value: Int64(sampleRange.count),timescale: audioContext.asset.duration.timescale))

    let outputSettingsDict: [String : Any] = [
        AVFormatIDKey: Int(kAudioFormatLinearPCM),AVLinearPCMBitDepthKey: 16,AVLinearPCMIsBigEndianKey: false,AVLinearPCMIsFloatKey: false,AVLinearPCMIsNonInterleaved: false
    ]

    let readerOutput = AVAssetReaderTrackOutput(track: audioContext.assetTrack,outputSettings: outputSettingsDict)
    readerOutput.alwaysCopiesSampleData = false
    reader.add(readerOutput)

    var channelCount = 1
    let formatDescriptions = audioContext.assetTrack.formatDescriptions as! [CMAudioFormatDescription]
    for item in formatDescriptions {
        guard let fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription(item) else {
            fatalError("Couldn't get the format description")
        }
        channelCount = Int(fmtDesc.pointee.mChannelsPerFrame)
    }

    let samplesPerPixel = max(1,channelCount * sampleRange.count / targetSamples)
    let filter = [Float](repeating: 1.0 / Float(samplesPerPixel),count: samplesPerPixel)

    var outputSamples = [Float]()
    var sampleBuffer = Data()

    // 16-bit samples
    reader.startReading()
    defer { reader.cancelReading() }

    while reader.status == .reading {
        guard let readSampleBuffer = readerOutput.copyNextSampleBuffer(),let readBuffer = CMSampleBufferGetDataBuffer(readSampleBuffer) else {
                break
        }
        // Append audio sample buffer into our current sample buffer
        var readBufferLength = 0
        var readBufferPointer: UnsafeMutablePointer<Int8>?
        CMBlockBufferGetDataPointer(readBuffer,&readBufferLength,nil,&readBufferPointer)
        sampleBuffer.append(UnsafeBufferPointer(start: readBufferPointer,count: readBufferLength))
        CMSampleBufferInvalidate(readSampleBuffer)

        let totalSamples = sampleBuffer.count / MemoryLayout<Int16>.size
        let downSampledLength = totalSamples / samplesPerPixel
        let samplesToProcess = downSampledLength * samplesPerPixel

        guard samplesToProcess > 0 else { continue }

        processSamples(fromData: &sampleBuffer,outputSamples: &outputSamples,samplesToProcess: samplesToProcess,downSampledLength: downSampledLength,samplesPerPixel: samplesPerPixel,filter: filter)
        //print("Status: \(reader.status)")
    }

    // Process the remaining samples at the end which didn't fit into samplesPerPixel
    let samplesToProcess = sampleBuffer.count / MemoryLayout<Int16>.size
    if samplesToProcess > 0 {
        let downSampledLength = 1
        let samplesPerPixel = samplesToProcess
        let filter = [Float](repeating: 1.0 / Float(samplesPerPixel),count: samplesPerPixel)

        processSamples(fromData: &sampleBuffer,filter: filter)
        //print("Status: \(reader.status)")
    }

    // if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown)
    guard reader.status == .completed || true else {
        fatalError("Couldn't read the audio file")
    }

    return outputSamples
}

E)render使用此函数对音频文件中的数据进行下采样,并转换为分贝:

func processSamples(fromData sampleBuffer: inout Data,outputSamples: inout [Float],samplesToProcess: Int,downSampledLength: Int,samplesPerPixel: Int,filter: [Float]) {
    sampleBuffer.withUnsafeBytes { (samples: UnsafePointer<Int16>) in
        var processingBuffer = [Float](repeating: 0.0,count: samplesToProcess)

        let sampleCount = vDSP_Length(samplesToProcess)

        //Convert 16bit int samples to floats
        vDSP_vflt16(samples,1,&processingBuffer,sampleCount)

        //Take the absolute values to get amplitude
        vDSP_vabs(processingBuffer,sampleCount)

        //get the corresponding dB,and clip the results
        getdB(from: &processingBuffer)

        //Downsample and average
        var downSampledData = [Float](repeating: 0.0,count: downSampledLength)
        vDSP_desamp(processingBuffer,vDSP_Stride(samplesPerPixel),filter,&downSampledData,vDSP_Length(downSampledLength),vDSP_Length(samplesPerPixel))

        //Remove processed samples
        sampleBuffer.removeFirst(samplesToProcess * MemoryLayout<Int16>.size)

        outputSamples += downSampledData
    }
}

F)反过来调用此函数获取相应的dB,并将结果剪辑为[noiseFloor,0]:

func getdB(from normalizedSamples: inout [Float]) {
    // Convert samples to a log scale
    var zero: Float = 32768.0
    vDSP_vdbcon(normalizedSamples,&zero,&normalizedSamples,vDSP_Length(normalizedSamples.count),1)

    //Clip to [noiseFloor,0]
    var ceil: Float = 0.0
    var noiseFloorMutable = noiseFloor
    vDSP_vclip(normalizedSamples,&noiseFloorMutable,&ceil,vDSP_Length(normalizedSamples.count))
}

G)最后你可以像这样得到音频的波形:

guard let path = Bundle.main.path(forResource: "audio",ofType:"mp3") else {
    fatalError("Couldn't find the file path")
}
let url = URL(fileURLWithPath: path)
var outputArray : [Float] = []
AudioContext.load(fromAudioURL: url,completionHandler: { audioContext in
    guard let audioContext = audioContext else {
        fatalError("Couldn't create the audioContext")
    }
    outputArray = self.render(audioContext: audioContext,targetSamples: 300)
})

不要忘记AudioContext.load(fromAudioURL :)是异步的.

该解决方案由William Entriken从this repo合成.所有的功劳归于他.

老解决方案

这是一个可用于预渲染音频文件的音量级别而无需播放的功能:

func averagePowers(audioFileURL: URL,forChannel channelNumber: Int,completionHandler: @escaping(_ success: [Float]) -> ()) {
    let audioFile = try! AVAudioFile(forReading: audioFileURL)
    let audioFilePFormat = audioFile.processingFormat
    let audioFileLength = audioFile.length

    //Set the size of frames to read from the audio file,you can adjust this to your liking
    let frameSizeToRead = Int(audioFilePFormat.sampleRate/20)

    //This is to how many frames/portions we're going to divide the audio file
    let numberOfFrames = Int(audioFileLength)/frameSizeToRead

    //Create a pcm buffer the size of a frame
    guard let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFilePFormat,frameCapacity: AVAudioFrameCount(frameSizeToRead)) else {
        fatalError("Couldn't create the audio buffer")
    }

    //Do the calculations in a background thread,if you don't want to block the main thread for larger audio files
    DispatchQueue.global(qos: .userInitiated).async {

        //This is the array to be returned
        var returnArray : [Float] = [Float]()

        //We're going to read the audio file,frame by frame
        for i in 0..<numberOfFrames {

            //Change the position from which we are reading the audio file,since each frame starts from a different position in the audio file
            audioFile.framePosition = AVAudioFramePosition(i * frameSizeToRead)

            //Read the frame from the audio file
            try! audioFile.read(into: audioBuffer,frameCount: AVAudioFrameCount(frameSizeToRead))

            //Get the data from the chosen channel
            let channelData = audioBuffer.floatChannelData![channelNumber]

            //This is the array of floats
            let arr = Array(UnsafeBufferPointer(start:channelData,count: frameSizeToRead))

            //Calculate the mean value of the absolute values
            let meanValue = arr.reduce(0,{$0 + abs($1)})/Float(arr.count)

            //Calculate the dB power (You can adjust this),if average is less than 0.000_000_01 we limit it to -160.0
            let dbPower: Float = meanValue > 0.000_000_01 ? 20 * log10(meanValue) : -160.0

            //append the db power in the current frame to the returnArray
            returnArray.append(dbPower)
        }

        //Return the dBPowers
        completionHandler(returnArray)
    }
}

你可以这样称呼它:

let path = Bundle.main.path(forResource: "audio.mp3",ofType:nil)!
let url = URL(fileURLWithPath: path)
averagePowers(audioFileURL: url,forChannel: 0,completionHandler: { array in
    //Use the array
})

使用仪器,此解决方案在1.2秒内使用高CPU,使用returnArray返回主线程大约需要5秒,在低电池模式下最多需要10秒.

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。

相关推荐


当我们远离最新的 iOS 16 更新版本时,我们听到了困扰 Apple 最新软件的错误和性能问题。
欧版/美版 特别说一下,美版选错了 可能会永久丧失4G,不过只有5%的概率会遇到选择运营商界面且部分必须连接到iTunes才可以激活
一般在接外包的时候, 通常第三方需要安装你的app进行测试(这时候你的app肯定是还没传到app store之前)。
前言为了让更多的人永远记住12月13日,各大厂都在这一天将应用变灰了。那么接下来我们看一下Flutter是如何实现的。Flutter中实现整个App变为灰色在Flutter中实现整个App变为灰色是非常简单的,只需要在最外层的控件上包裹ColorFiltered,用法如下:ColorFiltered(颜色过滤器)看名字就知道是增加颜色滤镜效果的,ColorFiltered( colorFilter:ColorFilter.mode(Colors.grey, BlendMode.
flutter升级/版本切换
(1)在C++11标准时,open函数的文件路径可以传char指针也可以传string指针,而在C++98标准,open函数的文件路径只能传char指针;(2)open函数的第二个参数是打开文件的模式,从函数定义可以看出,如果调用open函数时省略mode模式参数,则默认按照可读可写(ios_base:in | ios_base::out)的方式打开;(3)打开文件时的mode的模式是从内存的角度来定义的,比如:in表示可读,就是从文件读数据往内存读写;out表示可写,就是把内存数据写到文件中;
文章目录方法一:分别将图片和文字置灰UIImage转成灰度图UIColor转成灰度颜色方法二:给App整体添加灰色滤镜参考App页面置灰,本质是将彩色图像转换为灰度图像,本文提供两种方法实现,一种是App整体置灰,一种是单个页面置灰,可结合具体的业务场景使用。方法一:分别将图片和文字置灰一般情况下,App页面的颜色深度是24bit,也就是RGB各8bit;如果算上Alpha通道的话就是32bit,RGBA(或者ARGB)各8bit。灰度图像的颜色深度是8bit,这8bit表示的颜色不是彩色,而是256
领导让调研下黑(灰)白化实现方案,自己调研了两天,根据网上资料,做下记录只是学习过程中的记录,还是写作者牛逼
让学前端不再害怕英语单词(二),通过本文,可以对css,js和es6的单词进行了在逻辑上和联想上的记忆,让初学者更快的上手前端代码
用Python送你一颗跳动的爱心
在uni-app项目中实现人脸识别,既使用uni-app中的live-pusher开启摄像头,创建直播推流。通过快照截取和压缩图片,以base64格式发往后端。
商户APP调用微信提供的SDK调用微信支付模块,商户APP会跳转到微信中完成支付,支付完后跳回到商户APP内,最后展示支付结果。CSDN前端领域优质创作者,资深前端开发工程师,专注前端开发,在CSDN总结工作中遇到的问题或者问题解决方法以及对新技术的分享,欢迎咨询交流,共同学习。),验证通过打开选择支付方式弹窗页面,选择微信支付或者支付宝支付;4.可取消支付,放弃支付会返回会员页面,页面提示支付取消;2.判断支付方式,如果是1,则是微信支付方式。1.判断是否在微信内支付,需要在微信外支付。
Mac命令行修改ipa并重新签名打包
首先在 iOS 设备中打开开发者模式。位于:设置 - 隐私&安全 - 开发者模式(需重启)
一 现象导入MBProgressHUD显示信息时,出现如下异常现象Undefined symbols for architecture x86_64: "_OBJC_CLASS_$_MBProgressHUD", referenced from: objc-class-ref in ViewController.old: symbol(s) not found for architecture x86_64clang: error: linker command failed wit
Profiles >> 加号添加 >> Distribution >> "App Store" >> 选择 2.1 创建的App ID >> 选择绑定 2.3 的发布证书(.cer)>> 输入描述文件名称 >> Generate 生成描述文件 >> Download。Certificates >> 加号添加 >> "App Store and Ad Hoc" >> “Choose File...” >> 选择上一步生成的证书请求文件 >> Continue >> Download。
今天有需求,要实现的功能大致如下:在安卓和ios端实现分享功能可以分享链接,图片,文字,视频,文件,等欢迎大佬多多来给萌新指正,欢迎大家来共同探讨。如果各位看官觉得文章有点点帮助,跪求各位给点个“一键三连”,谢啦~声明:本博文章若非特殊注明皆为原创原文链接。