Chen Jiehua

The Pitfalls of Ubuntu 16.04

Ubuntu 16.04 LTS had just been released, and I couldn't wait to reinstall my system. The new release is nice, but a few pieces of software don't quite get along with it yet…

update-rc.d

I installed supervisor the usual way with apt-get, and configuring it to manage other processes worked fine. After a reboot, though, none of the programs under supervisor were running; a closer look showed that supervisord itself hadn't started with the system. Frustrating: an apt-get-installed supervisor on 14.04 never had this problem.

/etc/init.d/supervisor looked fine, and sudo service supervisor start worked; checking the /etc/rcX.d directories, the S/KNNsupervisor links were all there, but NN=01, which was suspicious.

# Remove the startup links
$ sudo update-rc.d supervisor remove
# Add them back
$ sudo update-rc.d supervisor defaults
# Same result: NN=01. Why is supervisor's default start order so early?
# On Ubuntu 14.04, in /etc/rc2.d, NN=20:
lrwxrwxrwx 1 root root 20 Jul 16 2015 S20supervisor -> ../init.d/supervisor*

Reading the update-rc.d man page and comparing with 14.04 reveals some differences:

# ubuntu 14.04
NAME
       update-rc.d - install and remove System-V style init script links

SYNOPSIS
       update-rc.d [-n] [-f] name remove
       update-rc.d [-n] name defaults [NN | SS KK]
       update-rc.d [-n] name start|stop NN runlevel [runlevel]...  .  start|stop NN runlevel [runlevel]...  . ...
       update-rc.d [-n] name disable|enable [ S|2|3|4|5 ]



# ubuntu 16.04
$ man update-rc.d
……
SYNOPSIS
       update-rc.d [-n] [-f] name remove
       update-rc.d [-n] name defaults
       update-rc.d [-n] name disable|enable [ S|2|3|4|5 ]
……

As you can see, the update-rc.d name defaults command has changed, and start/stop are gone entirely. Reading further:

# ubuntu 16.04

INSTALLING INIT SCRIPT LINKS
       update-rc.d  requires  dependency and runlevel information to be provided in the init.d script LSB comment header of all init.d
       scripts.  See the insserv(8) manual page for details about the LSB header format.

       When run with the defaults option, update-rc.d  makes  links  named  /etc/rcrunlevel.d/[SK]NNname  that  point  to  the  script
       /etc/init.d/name, using runlevel and dependency information from the init.d script LSB comment header.

       If  any files named /etc/rcrunlevel.d/[SK]??name already exist then update-rc.d does nothing.  The program was written this way
       so that it will never change an existing configuration, which may have been customized by the system administrator.   The  pro‐
       gram will only install links if none are present, i.e., if it appears that the service has never been installed before.

       Older versions of update-rc.d also supported start and stop options.  These options are no longer supported, and are now equiv‐
       alent to the defaults option.

       A common system administration error is to delete the links with the thought that this will "disable" the service,  i.e.,  that
       this  will  prevent  the service from being started.  However, if all links have been deleted then the next time the package is
       upgraded, the package's postinst script will run update-rc.d again and this will reinstall links at their factory default loca‐
       tions.   The  correct way to disable services is to configure the service as stopped in all runlevels in which it is started by
       default.  In the System V init system this means renaming the service's symbolic links from S to K.

       The script /etc/init.d/name must exist before update-rc.d is run to create the links.

In other words, update-rc.d now derives everything from the dependency and runlevel information declared in the script's LSB comment header under /etc/init.d, and no longer lets us specify the NN start order by hand. Looking at /etc/init.d/supervisor:

### BEGIN INIT INFO
# Provides:          supervisor
# Required-Start:    $remote_fs $network $named
# Required-Stop:     $remote_fs $network $named
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start/stop supervisor
# Description:       Start/stop supervisor daemon and its configured
#                    subprocesses.
### END INIT INFO

Something seems off with the Required-Start and Required-Stop lines, though I wasn't sure exactly what, so I used redis-server's init script as a reference:

### BEGIN INIT INFO
# Provides:             redis-server
# Required-Start:       $syslog $remote_fs
# Required-Stop:        $syslog $remote_fs
# Should-Start:         $local_fs
# Should-Stop:          $local_fs
# Default-Start:        2 3 4 5
# Default-Stop:         0 1 6
# Short-Description:    redis-server - Persistent key-value db
# Description:          redis-server - Persistent key-value db
### END INIT INFO
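Putting the two headers side by side, redis-server requires $syslog (and also declares Should-Start/Should-Stop for $local_fs) while supervisor does not. A small sketch of that comparison (dependency lists copied from the two headers above):

```shell
# Sketch: which Required-Start facilities redis-server declares that
# supervisor's LSB header lacks.
sup='$remote_fs $network $named'
redis='$syslog $remote_fs'
missing=''
for dep in $redis; do
  case " $sup " in
    *" $dep "*) ;;                      # already declared by supervisor
    *) missing="$missing $dep" ;;       # declared by redis-server only
  esac
done
echo "missing from supervisor:$missing"
```

Running it prints "missing from supervisor: $syslog", which matches the diagnosis below.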

The missing piece appears to be $syslog. After adding $syslog to /etc/init.d/supervisor, re-enable it at boot:

$ sudo update-rc.d supervisor defaults
# Now in /etc/rc2.d:
lrwxrwxrwx 1 root root 20 Apr 22 01:18 S02supervisor -> ../init.d/supervisor
lrwxrwxrwx 1 root root 22 Apr 21 23:40 S02redis-server -> ../init.d/redis-server
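The actual edit is a one-line change to the LSB header. A minimal sketch of doing it with sed, shown on a string copy rather than the live file (on a real system the target would be /etc/init.d/supervisor, edited as root with sed -i.bak to keep a backup):

```shell
# Sketch: insert $syslog at the front of the Required-Start/Required-Stop
# dependency lists (GNU sed, as shipped with Ubuntu).
header='# Required-Start:    $remote_fs $network $named
# Required-Stop:     $remote_fs $network $named'
fixed=$(echo "$header" | sed 's/^\(# Required-St\(art\|op\):[[:space:]]*\)/\1$syslog /')
echo "$fixed"
```

Both lines come back with $syslog prepended to the dependency list.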

java

Setting up kafka requires a Java runtime, and Ubuntu 16.04 offers openjdk-9-jre, so I apt-get installed that. Starting kafka afterwards failed with:

Unrecognized VM option 'PrintGCDateStamps'

Was something wrong with the JRE install?

$ java -version
openjdk version "9-internal"
OpenJDK Runtime Environment (build 9-internal+0-2016-04-14-195246.buildd.src)
OpenJDK 64-Bit Server VM (build 9-internal+0-2016-04-14-195246.buildd.src, mixed mode)

$ update-alternatives --display java
java - auto mode
 link currently points to /usr/lib/jvm/java-9-openjdk-amd64/jre/bin/java

Nothing looks wrong there. Could it be an OpenJDK problem? Let's try the Oracle JRE instead:

$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java9-installer

After that install, kafka still failed with the same error… Googling "PrintGCDateStamps" turned up others hitting a similar problem; one suggested fix is simply to comment the flags out of ./bin/kafka-run-class.sh in the kafka directory:

# GC options
GC_FILE_SUFFIX='-gc.log'
GC_LOG_FILE_NAME=''
if [ "x$GC_LOG_ENABLED" = "xtrue" ]; then
  GC_LOG_FILE_NAME=$DAEMON_NAME$GC_FILE_SUFFIX
  # comment out the two PrintGCxxxStamps options below
  KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "
fi

It feels like a hack, but at least it gets kafka running.
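An alternative to hand-editing is to strip just the two offending flags, which survives re-reading the script later. A sketch on a string copy (for the real file you would run sed -i.bak on ./bin/kafka-run-class.sh; flag names copied from the script above, the log path here is a placeholder):

```shell
# Sketch: drop the two flags JDK 9 rejects, keeping the rest of the GC line.
opts='-Xloggc:/tmp/gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps '
cleaned=$(echo "$opts" | sed -e 's/-XX:+PrintGCDateStamps //' -e 's/-XX:+PrintGCTimeStamps //')
echo "$cleaned"   # the GC line without the two *Stamps flags
```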

Next I wanted to run Apache Spark, but ./bin/pyspark failed with yet another Java error:

Python 2.7.11+ (default, Apr 17 2016, 14:00:29) 
[GCC 5.3.1 20160413] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Exception in thread "main" java.lang.NumberFormatException: For input string: "9-ea"
	at java.lang.NumberFormatException.forInputString(java.base@9-ea/NumberFormatException.java:65)
	at java.lang.Integer.parseInt(java.base@9-ea/Integer.java:695)
	at java.lang.Integer.parseInt(java.base@9-ea/Integer.java:813)
	at org.apache.spark.launcher.CommandBuilderUtils.addPermGenSizeOpt(CommandBuilderUtils.java:326)
	at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:223)
	at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:121)
	at org.apache.spark.launcher.Main.main(Main.java:86)
Traceback (most recent call last):
  File "/home/jachua/software/spark/spark-1.6.1-bin-hadoop2.6/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(pyFiles=add_files)
  File "/home/jachua/software/spark/spark-1.6.1-bin-hadoop2.6/python/pyspark/context.py", line 112, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/home/jachua/software/spark/spark-1.6.1-bin-hadoop2.6/python/pyspark/context.py", line 245, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/home/jachua/software/spark/spark-1.6.1-bin-hadoop2.6/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

Checking the Java version string:

$ java -version
java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+116)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+116, mixed mode)
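The stack trace is just integer parsing gone wrong: pre-9 version strings like "1.8.0_91" have a numeric field after the first dot, while "9-ea" has no dots at all. A rough shell sketch of the kind of parsing that breaks (the real failure is Java's Integer.parseInt inside Spark's CommandBuilderUtils, per the trace above):

```shell
# Sketch: version strings the Spark 1.6 launcher expects vs. what JDK 9 reports.
field2() { echo "$1" | cut -d. -f2; }
old=$(field2 "1.8.0_91")   # "8" -- a clean integer
new=$(field2 "9-ea")       # "9-ea" -- cut returns the whole string when the delimiter is absent
echo "$old vs $new"
```

Anything that then tries to read the second field as an integer throws exactly the NumberFormatException shown above.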

Annoying: of all things, a version-string parsing problem… Time to try jdk-8 instead:

# install openjdk and oracle jdk side by side; test them one at a time
$ sudo apt-get install openjdk-8-jre openjdk-8-jdk
$ sudo apt-get install oracle-java8-installer

Now check which JDKs are installed on the system:

$ sudo update-alternatives --display java
java - manual mode
  link best version is /usr/lib/jvm/java-8-oracle/jre/bin/java
  link currently points to /usr/lib/jvm/java-9-oracle/bin/java
  link java is /usr/bin/java
  slave java.1.gz is /usr/share/man/man1/java.1.gz
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java - priority 1081
  slave java.1.gz: /usr/lib/jvm/java-8-openjdk-amd64/jre/man/man1/java.1.gz
/usr/lib/jvm/java-8-oracle/jre/bin/java - priority 1094
  slave java.1.gz: /usr/lib/jvm/java-8-oracle/man/man1/java.1.gz
/usr/lib/jvm/java-9-openjdk-amd64/bin/java - priority 1091
  slave java.1.gz: /usr/lib/jvm/java-9-openjdk-amd64/man/man1/java.1.gz
/usr/lib/jvm/java-9-oracle/bin/java - priority 1093

# or just look at the directory
$ ls -al /usr/lib/jvm
lrwxrwxrwx   1 root root   20 Apr 23 16:26 java-1.8.0-openjdk-amd64 -> java-8-openjdk-amd64
lrwxrwxrwx   1 root root   20 Apr 15 04:09 java-1.9.0-openjdk-amd64 -> java-9-openjdk-amd64
drwxr-xr-x   7 root root 4096 May 16 00:50 java-8-openjdk-amd64
drwxr-xr-x   8 root root 4096 May 16 00:59 java-8-oracle
drwxr-xr-x   6 root root 4096 Apr 27 21:51 java-9-openjdk-amd64
drwxr-xr-x   8 root root 4096 May 15 23:33 java-9-oracle
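As a side note, the "auto mode" choice in the listing follows directly from those priorities: the alternative with the highest priority wins. A sketch using the numbers copied from the --display output above:

```shell
# Sketch: update-alternatives auto mode picks the highest-priority alternative.
best=$(printf '%s\n' \
  '1081 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java' \
  '1094 /usr/lib/jvm/java-8-oracle/jre/bin/java' \
  '1091 /usr/lib/jvm/java-9-openjdk-amd64/bin/java' \
  '1093 /usr/lib/jvm/java-9-oracle/bin/java' \
  | sort -rn | head -n1)
echo "$best"   # 1094 /usr/lib/jvm/java-8-oracle/jre/bin/java
```

That agrees with the "link best version is /usr/lib/jvm/java-8-oracle/jre/bin/java" line in the listing.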

Switch between the installed java versions:

$ sudo update-alternatives --config java

With openjdk-8 selected:

$ java -version
openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-0ubuntu4~16.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

With oracle-java-8 selected:

$ java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)

Spark then ran fine. Looking back at the original kafka problem, the openjdk version there was "9-internal", so every one of these pitfalls traces back to JDK 9.

Writing this up took effort; if you repost it, please credit ChenJiehua, "The Pitfalls of Ubuntu 16.04".
