Scraping the NetEase Cloud Music top charts into a database and uploading covers and mp3s to OSS
<p>NetEase Cloud Music top charts URL: <a href="https://music.163.com/#/discover/toplist">https://music.163.com/#/discover/toplist</a></p><p><img alt="image.png" src="data/attachment/forum/202302/23/2023-02-23_16-49-40_378.png" title="15904763121590476312_1668.png" /></p>
<p> </p>
<p>Data collected: song title, artist, song duration, cover image, the song's mp3 file, and song id</p>
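<p>Analysis shows that each song in the list is split on the page into two parts: the song title and its attributes. The page's JavaScript matches the two by song id and then renders them, which produces the page you see.</p>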
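<p>As a rough illustration of that two-part structure, here is a minimal sketch that parses a small hand-written sample and joins the two parts by song id. The sample markup is an assumption shaped after what the regular expressions in this post expect, parse_toplist is a hypothetical name, and treating duration as milliseconds is also an assumption.</p>
<pre>
<?php
// Hypothetical sample markup: a hidden list of titles plus a JSON blob of attributes.
$html = '<ul class="f-hide"><li><a href="/song?id=101">Song A</a></li>'
      . '<li><a href="/song?id=102">Song B</a></li></ul>'
      . '<textarea id="song-list-pre-data" style="display:none;">'
      . '[{"id":101,"duration":215000,"artists":[{"name":"Artist A"}]},'
      . '{"id":102,"duration":187500,"artists":[{"name":"Artist B"}]}]'
      . '</textarea>';

// Hypothetical helper: returns an array keyed by song id.
function parse_toplist(string $html): array
{
    $songs = [];

    // Part 1: song id and title, one capture group each.
    preg_match_all('/<a href="\/song\?id=(\d+)">(.*?)<\/a>/', $html, $links, PREG_SET_ORDER);
    foreach ($links as $link) {
        $songs[(int)$link[1]] = ['title' => $link[2], 'duration' => 0, 'artistName' => ''];
    }

    // Part 2: attributes from the JSON embedded in the textarea.
    if (preg_match('/<textarea id="song-list-pre-data"[^>]*>(.*?)<\/textarea>/s', $html, $m)) {
        $items = json_decode($m[1], true);
        foreach (is_array($items) ? $items : [] as $item) {
            $id = (int)($item['id'] ?? 0);
            if (!isset($songs[$id])) {
                continue; // no title was found for this id, skip it
            }
            // Assumption: duration is in milliseconds; keep whole seconds.
            $songs[$id]['duration']   = intdiv((int)($item['duration'] ?? 0), 1000);
            $songs[$id]['artistName'] = $item['artists'][0]['name'] ?? '';
        }
    }
    return $songs;
}

print_r(parse_toplist($html));
</pre>
<p>Using one capture group for the id and one for the title keeps each pair together in a single match, so nothing depends on ids and titles alternating in a flat list.</p>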
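<p>Downloading the cover images and mp3 files and then uploading them to OSS can cause the script to time out, so the work is split into two parts: one routine collects only the basic song information, and a second routine uses that information to download the cover and mp3 and upload them to OSS.</p>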
<p> </p>
<p>1. Collect the basic information and save it to the database:</p>
<pre>
public function mp3()
{
set_time_limit(0);
$caiji_list = 'https://music.163.com/discover/toplist';
$html = curl_($caiji_list);// curl_ is a custom helper that fetches the page HTML; file_get_contents can be used instead
preg_match('|(?<=<ul class="f-hide">)(.*?)(?=</ul>)|', $html, $matches);
if(empty($matches))
{
echo 'No data was retrieved';
return ;
}
$html2 = $matches[0];// full match: the contents of the hidden ul
unset($matches);
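preg_match_all('/((id=(\d)+(?="))|((?<=">)(.*?)(?=<)))/', $html2, $matches);
// $matches[0] holds the full matches, alternating between "id=<songid>" and the title
if(empty($matches[0]))
{
echo 'Failed to extract data from the li elements';
return ;
}
$lis = $matches[0];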
unset($matches);
$len = intdiv(count($lis), 2);// number of complete id/title pairs
$data = [];
for($i=0; $i<$len; $i++)
{
$j = $i * 2 + 1;
$k = $i * 2;
$data[$lis[$k]] = $lis[$j];
}
if(empty($data))
{
echo 'The assembled $data array is empty';
return ;
}
//dd($data);// sample entry: 'id=<songid>' => '遇到'; these pairs are what gets inserted
$dbmusic = Db::name('music');
foreach($data as $k => $v)
{
$tmp = explode('=', $k);// the key looks like "id=<songid>"
$id = $tmp[1] ?? 0;
$row = $dbmusic->where('songid', $id)->find();
if(empty($row['songid']))
{
$insert_data = [
'title' => $v,
'songid'=> $id,
];
$dbmusic->insert($insert_data);
}
}
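preg_match('/<textarea id="song-list-pre-data" style="display:none;">(.*?)<\/textarea>/',$html, $matches);
if(empty($matches[1]))
{
echo 'Could not match the additional song information';
return ;
}
$song_other = $matches[1];// capture group 1: the JSON inside the textarea
unset($matches);
$json = json_decode($song_other, true) ?: [];// fall back to an empty list if decoding fails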
foreach($json as $k => $v)
{
$id = $v['id'];
$duration = intval($v['duration']/1000);// assuming duration is in milliseconds; store whole seconds
$artistName = $v['artists'][0]['name'] ?? '';// artists is a list; take the first one
$update = [
'duration' => $duration,
'artistName'=> $artistName,
];
$dbmusic->where('songid', $id)->update($update);
}
echo $caiji_list . ': basic data has been saved to the database';
return;
}</pre>
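<p>The body of curl_ is not shown in this post. As a stand-in, here is a minimal sketch of a fetch helper built on file_get_contents with a per-request timeout. The name fetch_html, the 10-second default, and the user agent string are all assumptions for illustration, not the author's helper.</p>
<pre>
<?php
// Hypothetical stand-in for the post's curl_() helper.
// Returns the response body, or an empty string on failure.
function fetch_html(string $url, int $timeoutSeconds = 10): string
{
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'timeout'    => $timeoutSeconds,
            // Placeholder value; use whatever identifies your client.
            'user_agent' => 'Mozilla/5.0 (compatible; toplist-practice)',
        ],
    ]);

    // @ suppresses the warning on failure; the return value is checked instead.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? '' : $body;
}
</pre>
<p>A caller could then write $html = fetch_html('https://music.163.com/discover/toplist'); and stop early when the result is an empty string. Whether the site serves the same markup to such a request is not something this sketch can guarantee.</p>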
<p>2. Using the basic information, download the cover image and mp3 locally, then upload them to OSS:</p>
<pre>
// Fetch the cover image and download the mp3
public function mp33()
{
set_time_limit(0);
$dbmusic = Db::name('music');
$rows = $dbmusic->where('songid', '<>', 0)->where('playPath', '')->select();
foreach($rows as $k => $v)
{
$urlimg = 'https://music.163.com/song?id='.$v['songid'];
$html = curl_($urlimg);
preg_match('/(?<=<img src=")(.*?)(?=" class="j-img")/', $html, $match);
if(empty($match))
{
continue;
}
$remote_img_url = $match[1];// capture group 1: the cover image URL
unset($match);
$img_short_name = $v['songid'].'.jpg';
$filename = TEMP_PATH . $img_short_name;
file_put_contents($filename, file_get_contents($remote_img_url));
if(is_file($filename))
{
$ret = upload_file_to_oss('home/music/20200523/', $filename);
$url = $ret['code'] == 200 ? $ret['data'] : '';
if($url)
{
$dbmusic->where('songid', $v['songid'])->update(['image'=>$url, 'create_time'=>time()]);
@unlink($filename);
}
}
// Download the mp3
$download_url = 'http://music.163.com/song/media/outer/url?id='.$v['songid'].'.mp3';
$mp3_short_name = $v['songid'].'.mp3';
$mp3filename = TEMP_PATH . $mp3_short_name;
file_put_contents($mp3filename, file_get_contents($download_url));
if(is_file($mp3filename))
{
$ret = upload_file_to_oss('home/music/20200523/', $mp3filename);
$url = $ret['code'] == 200 ? $ret['data'] : '';
if($url)
{
$dbmusic->where('songid', $v['songid'])->update(['playPath' => $url]);
@unlink($mp3filename);
}
}
}
echo 'The data has been updated, please check';
return ;
}</pre>
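<p>One thing to watch in mp33(): when file_get_contents fails it returns false, and file_put_contents then writes an empty file, so the is_file check still passes. A minimal sketch of a download helper that avoids this is shown here; download_to_file is a hypothetical name, not part of the original code.</p>
<pre>
<?php
// Hypothetical helper: save the contents of $url to $dest.
// Returns false, without creating $dest, if nothing could be downloaded.
function download_to_file(string $url, string $dest): bool
{
    // @ suppresses the warning on failure; the return value is checked instead.
    $body = @file_get_contents($url);
    if ($body === false || $body === '') {
        return false;
    }

    return file_put_contents($dest, $body) !== false;
}
</pre>
<p>In the loop above, if (download_to_file($download_url, $mp3filename)) could replace the file_put_contents call and the is_file check, so the upload only runs when something was actually saved.</p>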
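<p>To be honest, the songs on these charts are not to my taste. They make me sleepy, and they lack the energy and mood of the music on Douyin. I am pretty much a Douyin addict and scroll through it every day. As for music, I like the classics that I like, and I do not listen to everything.</p>
<p><code style="background-color:#e6e6fa; border-left:5px solid #ff0000; display:block; padding:5px 5px 5px 5px">This post is only an exercise in using regular expressions to match the results I wanted, nothing more. If it infringes your copyright, please contact me and I will remove it.</code></p>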