Distributed Compilation with distcc on the Bit-Brick Cluster K1

Base environment

  • Bit-Brick Cluster K1

  • Gigabit Ethernet cable

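The Bit-Brick Cluster K1 is a high-performance compute expansion device. It builds a compute cluster from multiple core boards, which significantly increases system compute capacity for heavy workloads. It can host up to 4 core boards at once and provides a rich set of peripheral interfaces for connecting external devices and extending the system.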

Each SSOM-K1 core board is configured as follows:

  • SpacemiT K1 RISC-V SoC

  • 8GB LPDDR4x

  • 64GB eMMC

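The whole cluster has 32 SpacemiT X60 cores in total (4 boards with 8 cores each), each clocked at 1.6 GHz.

The core boards run at full load while compiling, so cooling them with a fan is recommended.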

Get the node IP addresses

Connect to the cluster over the USB serial port and obtain the IP addresses of nodes B, C, and D:


192.168.1.21

192.168.1.22

192.168.1.18

Install distcc

Install distcc on all nodes, and make sure every node runs the same version.


sudo apt update

sudo apt install gcc make libncurses-dev libssl-dev bc flex bison -y

sudo apt install distcc -y
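
Since the versions must match, it can help to compare them mechanically rather than by eye. The following is a minimal sketch, not part of the original procedure: `versions_match` is a hypothetical helper that reads one version line per node on standard input and succeeds only if they are all identical.

```shell
#!/bin/sh
# versions_match (hypothetical helper): read one version string per line on
# stdin and succeed only if all non-empty lines are identical.
versions_match() {
    awk '
        NF { seen[$0] = 1 }
        END {
            n = 0
            for (k in seen) n++
            exit(n == 1 ? 0 : 1)
        }
    '
}
```

One possible use, assuming SSH access to each node (not covered in this post), is to collect the first line of `distcc --version` from every node and pipe the lines into the helper: `for h in 192.168.1.18 192.168.1.21 192.168.1.22; do ssh "$h" 'distcc --version | head -n 1'; done | versions_match && echo "versions match"`.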

Configure distcc

Configure distcc on all nodes:


sudo vim /etc/default/distcc

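Modify the following settings:


STARTDISTCC="true"

ALLOWEDNETS="192.168.1.0/24" # Allow the master node's subnet; adjust for your network

LISTENER="0.0.0.0" # Listen on all IP addresses

JOBS="$(nproc)" # Use all CPU cores; adjust as needed

On the master node, use these values instead:


ALLOWEDNETS="127.0.0.1 192.168.1.0/24" # Allow localhost and the master node's subnet; adjust for your network

LISTENER="127.0.0.1" # Listen on the local address only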
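
If you would rather not edit the file by hand on every node, the same settings can be applied with a small helper. This is only a sketch: `set_option` is a hypothetical helper name, and it assumes each setting sits on its own line in `KEY="value"` form, as in the settings shown above.

```shell
#!/bin/sh
# set_option FILE KEY VALUE (hypothetical helper):
# rewrite an existing KEY=... line in FILE, or append KEY="VALUE" if none exists.
# Assumes VALUE contains no '|', '&' or backslash characters.
set_option() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=\"${value}\"|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s="%s"\n' "$key" "$value" >> "$file"
    fi
}
```

In a root shell on a worker node this could be called as `set_option /etc/default/distcc STARTDISTCC true`, followed by similar calls for ALLOWEDNETS, LISTENER, and JOBS. Trying it on a copy of the file first is a sensible precaution.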

Open the distcc port:


sudo ufw allow 3632/tcp

Start the distcc service:


sudo systemctl restart distcc
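
After the restart, it can be useful to confirm that the daemon is actually listening on port 3632 before moving on. The sketch below filters the output of `ss -ltn`. `port_listening` is a hypothetical helper, and the column layout it relies on (state in column 1, local address and port in column 4) is an assumption about the `ss` version installed on the nodes.

```shell
#!/bin/sh
# port_listening PORT (hypothetical helper): read `ss -ltn` output on stdin and
# succeed if any LISTEN socket has a local address ending in ":PORT".
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

On a node it could be used as `ss -ltn | port_listening 3632 && echo "distcc is listening"`.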

Set the environment variable, adding the IP addresses of the master and worker nodes:


export DISTCC_HOSTS="192.168.1.18 192.168.1.21 192.168.1.22 localhost"
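
The distcc host list also accepts a per-host job limit in the form `HOST/LIMIT`. As a sketch of how the list above could be built with explicit limits: `hosts_with_limit` is a hypothetical helper, and the limit of 8 is an assumption chosen to match the 8 cores on each core board.

```shell
#!/bin/sh
# hosts_with_limit LIMIT HOST... (hypothetical helper):
# print the hosts as a space-separated list with "/LIMIT" appended to each.
hosts_with_limit() {
    limit=$1; shift
    out=""
    for host in "$@"; do
        out="${out:+$out }${host}/${limit}"
    done
    echo "$out"
}

# Assumption: 8 jobs per worker, matching the 8 cores of each K1 board.
DISTCC_HOSTS="$(hosts_with_limit 8 192.168.1.18 192.168.1.21 192.168.1.22) localhost"
export DISTCC_HOSTS
echo "$DISTCC_HOSTS"   # prints: 192.168.1.18/8 192.168.1.21/8 192.168.1.22/8 localhost
```

Whether an explicit limit helps depends on the workload, so treat the value as a starting point for experiments.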

On the master node, check the distcc connection status:


# Test the distcc connection

distcc --show-hosts

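You should now see the IP addresses of the master and worker nodes. If the worker nodes are missing, try restarting the distcc service, then run the following commands to verify that the master and worker nodes can reach each other: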


# Replace the IP addresses below to match your network

for host in 192.168.1.18 192.168.1.21 192.168.1.22; do

    echo "Testing $host ..."

    nc -zv "$host" 3632

    distcc --version -h "$host"

done

Expected output:


Testing 192.168.1.18 ...

Connection to 192.168.1.18 3632 port [tcp/distcc] succeeded!

distcc 3.4 riscv64-unknown-linux-gnu

(protocols 1, 2 and 3) (default port 3632)

built Apr 1 2024 05:42:12

Copyright (C) 2002, 2003, 2004 by Martin Pool.

Includes miniLZO (C) 1996-2002 by Markus Franz Xaver Johannes Oberhumer.

Portions Copyright (C) 2007-2008 Google.

distcc comes with ABSOLUTELY NO WARRANTY. distcc is free software, and

you may use, modify and redistribute it under the terms of the GNU

General Public License version 2 or later.

Built with Zeroconf support.

Built with GSS-API support for mutual authentication.

Please report bugs to distcc@lists.samba.org

Testing 192.168.1.21 ...

Connection to 192.168.1.21 3632 port [tcp/distcc] succeeded!

distcc 3.4 riscv64-unknown-linux-gnu

(protocols 1, 2 and 3) (default port 3632)

built Apr 1 2024 05:42:12

Copyright (C) 2002, 2003, 2004 by Martin Pool.

Includes miniLZO (C) 1996-2002 by Markus Franz Xaver Johannes Oberhumer.

Portions Copyright (C) 2007-2008 Google.

distcc comes with ABSOLUTELY NO WARRANTY. distcc is free software, and

you may use, modify and redistribute it under the terms of the GNU

General Public License version 2 or later.

Built with Zeroconf support.

Built with GSS-API support for mutual authentication.

Please report bugs to distcc@lists.samba.org

Testing 192.168.1.22 ...

Connection to 192.168.1.22 3632 port [tcp/distcc] succeeded!

distcc 3.4 riscv64-unknown-linux-gnu

(protocols 1, 2 and 3) (default port 3632)

built Apr 1 2024 05:42:12

Copyright (C) 2002, 2003, 2004 by Martin Pool.

Includes miniLZO (C) 1996-2002 by Markus Franz Xaver Johannes Oberhumer.

Portions Copyright (C) 2007-2008 Google.

distcc comes with ABSOLUTELY NO WARRANTY. distcc is free software, and

you may use, modify and redistribute it under the terms of the GNU

General Public License version 2 or later.

Built with Zeroconf support.

Built with GSS-API support for mutual authentication.

Please report bugs to distcc@lists.samba.org

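distcc 3.4 has a bug that causes errors on the RISC-V architecture. As a workaround, run the following command on every node; it disables distcc's compiler-name rewriting step (dcc_gcc_rewrite_fqn):


export DISTCC_NO_REWRITE_CROSS=1

For details, see: Bug: Buffer Overflow Detected on aarch64 with distcc 3.4. · Issue #546 · distcc/distcc · GitHub

The issue is expected to be fixed in distcc 3.5.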
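
An `export` only lasts for the current shell session, so the variable has to be set again after every login unless it is stored in a startup file. The sketch below shows one way to persist it without creating duplicate lines. `append_once` is a hypothetical helper, and using `~/.bashrc` assumes the nodes use bash as the login shell.

```shell
#!/bin/sh
# append_once FILE LINE (hypothetical helper):
# append LINE to FILE unless an identical line is already present.
append_once() {
    file=$1 line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, `append_once "$HOME/.bashrc" 'export DISTCC_NO_REWRITE_CROSS=1'` can be run any number of times and adds the line only once. The same helper would work for the DISTCC_HOSTS line on the master node.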

Compile

In general, you can build a C program by running the following command on the master node:


make -j$(nproc) CC=distcc

Adjust the build command to your needs. The rest of this section uses a Linux kernel build as an example.

Install the build dependencies:


sudo apt-get install debhelper libpfm4-dev libtraceevent-dev asciidoc libelf-dev devscripts git

Clone the source code:


git clone https://gitee.com/bianbu-linux/linux-6.6.git --depth=1

cd linux-6.6

Generate the configuration file:


make k1_defconfig

Build the kernel with the following script:


vim distcc_build.sh


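#!/bin/bash

# Compute the job count: half of the job slots that distcc reports
NUM_JOBS=$(( $(distcc -j) / 2 ))

# An empty KBUILD_BUILD_TIMESTAMP avoids timestamp warnings
make -j"$NUM_JOBS" CC="distcc gcc" \
    CXX="distcc g++" \
    CPP="distcc cpp" \
    KBUILD_BUILD_TIMESTAMP=''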

Run the script:


chmod +x distcc_build.sh

bash distcc_build.sh
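
The script derives its job count by halving the output of `distcc -j`. If that command fails or the host list is empty, the result can be 0 or the arithmetic can fail outright. The sketch below is a more defensive version of the same calculation. `job_count` is a hypothetical helper, not something distcc provides.

```shell
#!/bin/sh
# job_count SLOTS DIVISOR (hypothetical helper):
# print SLOTS / DIVISOR using integer division, but never less than 1.
# An empty or non-numeric SLOTS (for example, if `distcc -j` failed) counts as 0.
job_count() {
    slots=$1 divisor=$2
    case $slots in
        ''|*[!0-9]*) slots=0 ;;
    esac
    n=$(( slots / divisor ))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}
```

In the build script it could replace the arithmetic line as `NUM_JOBS=$(job_count "$(distcc -j 2>/dev/null)" 2)`.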

You can also run the make command from the script directly on the master node to build the kernel.

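Common problems

If the following errors appear, the master node has run out of memory. Try setting up a swap partition or reducing the number of jobs.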


[ 5860.732132] Out of memory: Killed process 84853 (cc1) total-vm:59900kB, anon-rss:18336kB, file-rss:896kB, shmem-rss:0kB, UID:1000 pgtables:124kB oom_score_adj:0

[ 5861.875207] Out of memory: Killed process 84772 (cc1) total-vm:60008kB, anon-rss:18372kB, file-rss:896kB, shmem-rss:0kB, UID:1000 pgtables:120kB oom_score_adj:0

[ 5863.718845] Out of memory: Killed process 84498 (cc1) total-vm:60056kB, anon-rss:18664kB, file-rss:512kB, shmem-rss:0kB, UID:1000 pgtables:124kB oom_score_adj:0

[ 5864.533358] Out of memory: Killed process 84593 (cc1) total-vm:58144kB, anon-rss:18044kB, file-rss:1024kB, shmem-rss:0kB, UID:1000 pgtables:116kB oom_score_adj:0
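
As a sketch of the swap option, the helper below prints the commands that would create and enable a swap file, so they can be reviewed before anything is run as root. `swap_commands` is a hypothetical helper, and the path and size are placeholders. It also assumes a filesystem on which `fallocate` can produce a usable swap file. If that is not the case, creating the file with `dd` is the usual alternative.

```shell
#!/bin/sh
# swap_commands PATH SIZE (hypothetical helper):
# print, without running them, the commands that create and enable a swap file.
swap_commands() {
    path=$1 size=$2
    printf 'fallocate -l %s %s\n' "$size" "$path"
    printf 'chmod 600 %s\n' "$path"
    printf 'mkswap %s\n' "$path"
    printf 'swapon %s\n' "$path"
}

# Example with placeholder values: a 4 GiB swap file at /swapfile.
swap_commands /swapfile 4G
```

After checking the printed commands, they can be executed with `swap_commands /swapfile 4G | sudo sh`. A swap file enabled this way does not persist across reboots unless it is also added to /etc/fstab.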

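If the build hangs and the following errors appear, the cluster nodes cannot reach each other over the network. Try pinging each node, or check the firewall settings. If those are all fine, the cause may be a system issue, so try rebooting the cluster nodes.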


distcc[92628] ERROR: failed to connect to 192.168.1.21:3632: Network is unreachable

distcc[92628] ERROR: failed to connect to 192.168.1.22:3632: Network is unreachable

distcc[92628] ERROR: failed to connect to 192.168.1.18:3632: Network is unreachable
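
To narrow down which nodes are affected, a short loop can probe each one and list only the failures. This is a minimal sketch: `probe_distcc` and `failed_hosts` are hypothetical helpers, and the 3-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# probe_distcc HOST (hypothetical helper): succeed if TCP port 3632 on HOST
# accepts a connection within 3 seconds.
probe_distcc() {
    nc -z -w 3 "$1" 3632
}

# failed_hosts PROBE HOST... (hypothetical helper):
# run the PROBE command on each HOST and print the hosts for which it fails.
failed_hosts() {
    probe=$1; shift
    for host in "$@"; do
        "$probe" "$host" >/dev/null 2>&1 || echo "$host"
    done
}
```

Running `failed_hosts probe_distcc 192.168.1.18 192.168.1.21 192.168.1.22` on the master node prints nothing when every worker is reachable. Passing the probe as an argument also allows a different check, such as a ping wrapper, to be swapped in.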

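If the build fails with the following errors (`已中止(核心已转储)` is the Chinese-locale form of `Aborted (core dumped)`):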


*** buffer overflow detected ***: terminated

已中止(核心已转储)


Aborted (core dumped)

try setting this environment variable:


export DISTCC_NO_REWRITE_CROSS=1


Video (bilibili): RISC-V cluster compilation! Building the Linux kernel with distcc on the Bit-Brick Cluster K1