Related articles
An exciting day! I am starting to record the little moments of life here, and looking ahead.
Written by allen
The Youth’s Companion, Feb. 7, 1889, p. 73 (Vol. 62). JUST THE BOY WANTED, II: IN THE LAW, by Judge Oliver Wendell Holmes (from Howe, Mark DeWolfe. Research materials relating to life of Oliver Wendell Holmes.) A boy who wants to succeed in the law will probably do so. An encouraging thought, as far as it goes. But […]
Written by allen
Huh: HUH is typically used as a slang word meaning “I am confused or surprised” or “Do you understand?”. Chinese equivalent: 嗯哼?
Duh: DUH is an ironic response to a question or statement, implying that the speaker is stupid or that the reply is obvious. Chinese equivalents: 显而易见 ("obviously"), 废话 ("no kidding").
Written by allen
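Wikipedia translates "attic" into Chinese as 阁楼. Over the past two years quite a few Apache projects have moved into the Attic: sqoop, ambri… When does a project enter the Attic? Either the PMC decides to stop maintaining it, or the ASF recommends that the PMC move it there. New technologies keep emerging, and the arrival of replacements is the direct reason these projects fade into history. For example, sqoop has been replaced by datax, by data integration platforms built on flink or spark (seatunnel), and by kafka connect. Conclusion: no flower blooms for a hundred days, and the same holds for the life cycle of a software project. Survival of the fittest is the unchanging law of evolution in nature and society.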
Written by allen
As you said in the last video of "Machine Learning", I hope we can use AI to build cool products and make a better life. Thank you, Andrew Ng.
Written by allen
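Overview
Hudi relies heavily on Spark to implement its table format design and file management, which solves the problem of unifying batch and streaming storage. Unless a metastore is used, all metadata is kept in files. (Hudi designed a simple metastore of its own in version 1.0. Today it is mainly combined with the Hive Metastore to build a lakehouse; this is not mandatory, and it serves external-table queries from other compute engines.) Because there are so many files, small-file management becomes a problem, so at least a Spark 3 environment is recommended.
Build
Hudi's metadata uses the HFile format to store file information, which avoids the cost of traversing OBS. The problem is that the source depends on HBase 2.4.9, which is built against Hadoop 2.x by default. If hudi-bundle.jar runs in a Hadoop 3.x environment, it throws unexpected exceptions, for example about classes or methods that cannot be found:
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
The fix is to compile HBase 2.4.9 yourself with hadoop.profile=3, and then compile Hudi.
Note that HBase has some pitfalls: many of the commands it calls are not supported on Windows. If the corresponding shell commands are not installed, you can comment out some of the unneeded exec-maven-plugin executions, which are mainly used for verification work.
Hudi build
References: HUDI-META-HBASE ISSUE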
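The NoSuchMethodError quoted above packs the missing method into a JVM descriptor, which is hard to read at a glance. Here is a minimal sketch of a decoder for such messages. The helper names `decode_missing_method` and `_read_type` are made up for illustration and are not part of Hudi, HBase, or Hadoop.

```python
import re

# Hypothetical helper (not part of Hudi, HBase, or Hadoop): decode the JVM
# method descriptor that appears in a NoSuchMethodError message.
_PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Decode one type starting at desc[pos]; return (name, next_pos)."""
    dims = 0
    while desc[pos] == "[":          # each '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":             # object type: L<binary/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                            # single-letter primitive or void
        name = _PRIMITIVES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def decode_missing_method(message):
    """Split 'pkg.Class.method(args)ret' into readable parts."""
    match = re.fullmatch(r"(.+)\.([^.(]+)\((.*)\)(.+)", message.strip())
    if match is None:
        raise ValueError("not a JVM method signature: " + message)
    owner, method, args_desc, ret_desc = match.groups()
    args, pos = [], 0
    while pos < len(args_desc):
        name, pos = _read_type(args_desc, pos)
        args.append(name)
    return_type, _ = _read_type(ret_desc, 0)
    return {"owner": owner, "method": method,
            "args": args, "returns": return_type}


if __name__ == "__main__":
    error = ("org.apache.hadoop.hdfs.client.HdfsDataInputStream."
             "getReadStatistics()"
             "Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;")
    print(decode_missing_method(error))
```

Decoded, the message says the caller expected `getReadStatistics()` with no arguments and a return type of the nested class `DFSInputStream$ReadStatistics`. The JVM links methods by their full descriptor, return type included, so a jar compiled against one Hadoop line can fail this way on another. That is presumably what happens here, and it is why rebuilding HBase against the Hadoop 3 profile helps.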
Written by allen