360SDN.COM


Introduction to the open-source Java POI components: HWPF and XWPF

Posted: 2017-10-24 10:41:18

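HWPF is the POI component for Word 97-2003 (.doc) documents in Java. It supports reading and writing Word documents, although writing is only partly implemented so far; it also offers simple text extraction for the older Word 6 and Word 95 formats.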

XWPF is the POI component for Word 2007+ (.docx) documents in Java; it provides reading and writing of simple files.



Overview of working with Word in Apache POI

1 Overview

1.1 Package overview

  HWPF       ->     Microsoft Word 97-2003     (.doc)
  XWPF       ->     Microsoft Word 2007+       (.docx)
         HWPF and XWPF offer similar features, but there is currently no common interface across the two.
 
Package                          Content
org.apache.poi.hdf               Legacy code; internal, not for direct use
org.apache.poi.hwpf.model        Refactored legacy code; internal, not for direct use
org.apache.poi.hwpf.usermodel    Public code; the main user-facing API
org.apache.poi.hwpf.extractor    Extracts (reads) the content of Word documents
org.apache.poi.hwpf.converter    Word-to-HTML and Word-to-FO (the FO output can be turned into PDF with Apache FOP)
org.apache.poi.hwpf.dev          For developers
 
 
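         The entry point to HWPF is HWPFDocument. In the current version (3.10) it refers to interfaces in both the org.apache.poi.hwpf.model and org.apache.poi.hwpf.usermodel packages; later versions may expose different interfaces.
         The entry point to XWPF is XWPFDocument; from it you can get paragraphs, pictures, tables, headers and so on.
         The download package contains only a few examples, under HWPF and XWPF in the examples directory; some further test code can be fetched from SVN (HWPF, XWPF).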

1.2    HWPF in brief

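Basic text extraction       Use the org.apache.poi.hwpf.extractor.WordExtractor class, which accepts an input stream or an HWPFDocument as its constructor argument. Call getText() to get the text of all paragraphs, or getParagraphText() to get the text of each paragraph in turn.
Specific text extraction    To get specific text or elements, first create an org.apache.poi.hwpf.HWPFDocument instance, fetch the overall range with getRange(), then get the paragraphs from it and, from those, the smaller elements.
Headers and footers         To read the headers and footers of a document, first create an org.apache.poi.hwpf.HWPFDocument instance, then create an org.apache.poi.hwpf.usermodel.HeaderStories instance, passing the HWPFDocument as the argument. The HeaderStories instance gives access to the headers and footers, including the first-page, odd-page and even-page ones. HeaderStories can also strip macros from header and footer text.
Changing text               Text can be changed with the insertBefore() and insertAfter() methods of a Range, Paragraph or CharacterRun.
For further test examples, see SVN.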

1.3    XWPF in brief

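Basic text extraction       Use org.apache.poi.xwpf.extractor.XWPFWordExtractor, which accepts an input stream or an XWPFDocument as its constructor argument. Its getText() method returns the text of all paragraphs, together with tables, headers and so on.
Specific text extraction    To get specific text or elements, first create an org.apache.poi.xwpf.usermodel.XWPFDocument instance, select the IBodyElement of interest (Table, Paragraph etc.), get an XWPFRun from it, and finally read the text or properties from that XWPFRun.
Headers and footers         To read the headers and footers of a document, first create an org.apache.poi.xwpf.usermodel.XWPFDocument instance, then create an org.apache.poi.xwpf.usermodel.XWPFHeaderFooter instance, passing the XWPFDocument as the argument. The XWPFHeaderFooter instance gives access to the headers and footers, including the first-page, odd-page and even-page ones.
Changing text               From an XWPFParagraph you can get the XWPFRun elements that make up its text. To add new text, call createRun() to append an XWPFRun at the end of the text; insertNewRun(int) adds an XWPFRun at a given position in the paragraph instead. Once you have an XWPFRun, call its setText(String) method to change the text. To add whitespace elements such as tabs or line breaks, call addTab() or addCarriageReturn().
For further test examples, see SVN.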



Source: https://poi.apache.org/document/quick-guide.html


 

POI-XWPF - A Quick Guide

XWPF has a fairly stable core API, providing read and write access to the main parts of a Word .docx file, but it isn't complete. For some things, it may be necessary to dive down into the low-level XMLBeans objects to manipulate the OOXML structure. If you find yourself having to do this, please consider sending in a patch to enhance XWPF; see the "Contribution to POI" page.

 

Basic Text Extraction

For basic text extraction, make use of org.apache.poi.xwpf.extractor.XWPFWordExtractor. It accepts an input stream or an XWPFDocument. The getText() method can be used to get the text from all the paragraphs, along with tables, headers etc.
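As a minimal sketch of the extraction described above, the following wraps an XWPFDocument in an XWPFWordExtractor and returns getText(). The class name XwpfExtractText, the helper extractText and the path sample.docx are illustrative and not part of POI; the sketch also assumes a reasonably recent POI release on the classpath, one in which XWPFDocument can be used in try-with-resources. The stream is first opened as an XWPFDocument here, which is then handed to the extractor.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class XwpfExtractText {

    /** Returns the text of the whole document as assembled by XWPFWordExtractor. */
    static String extractText(XWPFDocument document) {
        // The extractor is deliberately not closed, so the caller keeps
        // ownership of the document and decides when to close it.
        XWPFWordExtractor extractor = new XWPFWordExtractor(document);
        return extractor.getText();
    }

    public static void main(String[] args) throws IOException {
        // "sample.docx" is a placeholder path used when no argument is given.
        String path = args.length > 0 ? args[0] : "sample.docx";
        try (InputStream in = new FileInputStream(path);
             XWPFDocument document = new XWPFDocument(in)) {
            System.out.println(extractText(document));
        }
    }
}
```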

 

Specific Text Extraction

To get specific bits of text, first create an org.apache.poi.xwpf.usermodel.XWPFDocument. Select the IBodyElement of interest (Table, Paragraph etc), and from there get an XWPFRun. Finally fetch the text and properties from that.
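The walk from document to body element to run might look like the sketch below, which collects the text of every bold run in top-level paragraphs. Filtering on bold is just one example of reading a run property; the class XwpfBoldRuns and the method boldText are made-up names, and tables are skipped to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.IBodyElement;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class XwpfBoldRuns {

    /** Collects the text of each bold run found in top-level paragraphs. */
    static List<String> boldText(XWPFDocument document) {
        List<String> result = new ArrayList<>();
        for (IBodyElement element : document.getBodyElements()) {
            if (!(element instanceof XWPFParagraph)) {
                continue; // tables and other body elements are ignored in this sketch
            }
            for (XWPFRun run : ((XWPFParagraph) element).getRuns()) {
                // getText(0) reads the first text element of the run; it may be null.
                String text = run.getText(0);
                if (run.isBold() && text != null) {
                    result.add(text);
                }
            }
        }
        return result;
    }
}
```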

 

Headers and Footers

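To get at the headers and footers of a Word document, first create an org.apache.poi.xwpf.usermodel.XWPFDocument. Next, you need an org.apache.poi.xwpf.usermodel.XWPFHeaderFooter for your XWPFDocument; because XWPFHeaderFooter is abstract, the concrete headers and footers are in practice reached through org.apache.poi.xwpf.model.XWPFHeaderFooterPolicy, which is built from the XWPFDocument. Finally, the XWPFHeaderFooter gives you access to the headers and footers, including first / even / odd page ones if defined in your document.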
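One way to read header and footer text is sketched below. It goes through XWPFHeaderFooterPolicy (obtained from XWPFDocument.getHeaderFooterPolicy()), since XWPFHeaderFooter itself is abstract and XWPFHeader and XWPFFooter are its concrete subclasses. The class XwpfHeaderFooterText and its two helpers are illustrative names, and returning an empty string for a missing header or footer is a choice made for this sketch only.

```java
import org.apache.poi.xwpf.model.XWPFHeaderFooterPolicy;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFFooter;
import org.apache.poi.xwpf.usermodel.XWPFHeader;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class XwpfHeaderFooterText {

    /** Text of the default header, or "" when the document defines none. */
    static String defaultHeaderText(XWPFDocument document) {
        XWPFHeaderFooterPolicy policy = document.getHeaderFooterPolicy();
        if (policy == null) {
            return ""; // no header/footer policy available for this document
        }
        XWPFHeader header = policy.getDefaultHeader();
        return header == null ? "" : header.getText();
    }

    /** Text of the default footer, or "" when the document defines none. */
    static String defaultFooterText(XWPFDocument document) {
        XWPFHeaderFooterPolicy policy = document.getHeaderFooterPolicy();
        if (policy == null) {
            return "";
        }
        XWPFFooter footer = policy.getDefaultFooter();
        return footer == null ? "" : footer.getText();
    }
}
```

The policy also exposes getFirstPageHeader(), getEvenPageHeader(), getOddPageHeader() and the matching footer getters for the first / even / odd page variants.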

 

Changing Text

From an XWPFParagraph, it is possible to fetch the existing XWPFRun elements that make up the text. To add new text, the createRun() method will add a new XWPFRun to the end of the list. insertNewRun(int) can instead be used to add a new XWPFRun at a specific point in the paragraph.

Once you have an XWPFRun, you can use the setText(String) method to make changes to the text. To add whitespace elements such as tabs and line breaks, it is necessary to use methods like addTab() and addCarriageReturn().
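The run-editing calls above can be combined as in this small sketch, which builds a paragraph from two runs: one appended with createRun() and one placed in front of it with insertNewRun(0). The class XwpfEditRuns, the method buildGreeting and the literal strings are illustrative only.

```java
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class XwpfEditRuns {

    /** Adds a new paragraph made of two runs and returns it. */
    static XWPFParagraph buildGreeting(XWPFDocument document) {
        XWPFParagraph paragraph = document.createParagraph();

        // createRun() appends a run at the end of the paragraph.
        XWPFRun last = paragraph.createRun();
        last.setText("world");
        last.addCarriageReturn(); // line break after the text

        // insertNewRun(0) places a new run before the existing one.
        XWPFRun first = paragraph.insertNewRun(0);
        first.setText("Hello");
        first.addTab(); // tab between the two runs

        return paragraph;
    }
}
```

To persist the change, the document would then be written out with XWPFDocument.write(OutputStream).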

 

Further Examples

For now, there are a limited number of XWPF examples in the Examples Package. Beyond those, the best source of additional examples is in the unit tests. Browse the XWPF unit tests.

 


POI-HWPF - A Quick Guide

HWPF is still in early development. It is in the scratchpad section of the SVN. You will need to ensure you either have a recent SVN checkout, or a recent SVN nightly build (including the scratchpad jar!)

 

Basic Text Extraction

For basic text extraction, make use of org.apache.poi.hwpf.extractor.WordExtractor. It accepts an input stream or an HWPFDocument. The getText() method can be used to get the text from all the paragraphs, or getParagraphText() can be used to fetch the text from each paragraph in turn. The other option is getTextFromPieces(), which is very fast, but tends to return things that aren't text from the page. YMMV.
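A minimal sketch of paragraph-by-paragraph extraction with WordExtractor follows. It assumes the poi-scratchpad jar is on the classpath and a POI release in which WordExtractor is closeable; HwpfExtractText, readParagraphs and the path sample.doc are illustrative names, not part of POI.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.extractor.WordExtractor;

// Hypothetical example class; only WordExtractor comes from POI.
final class HwpfExtractText {

    /** Reads a .doc stream and returns the text of each paragraph in turn. */
    static String[] readParagraphs(InputStream in) throws IOException {
        try (WordExtractor extractor = new WordExtractor(in)) {
            return extractor.getParagraphText();
        }
    }

    public static void main(String[] args) throws IOException {
        // "sample.doc" is a placeholder path used when no argument is given.
        String path = args.length > 0 ? args[0] : "sample.doc";
        try (InputStream in = new FileInputStream(path)) {
            for (String paragraph : readParagraphs(in)) {
                // Paragraph text normally carries its own line ending already.
                System.out.print(paragraph);
            }
        }
    }
}
```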

 

Specific Text Extraction

To get specific bits of text, first create an org.apache.poi.hwpf.HWPFDocument. Fetch the range with getRange(), then get paragraphs from that. You can then get text and other properties.
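The range-to-paragraph-to-run walk could be sketched as follows. The example lists each character run with its bold flag; which properties to read is an arbitrary choice for illustration, and HwpfListRuns and describeRuns are made-up names. It assumes a POI release in which HWPFDocument can be used in try-with-resources.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class HwpfListRuns {

    /** Returns one line per character run: paragraph index, bold flag, text. */
    static List<String> describeRuns(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (HWPFDocument document = new HWPFDocument(in)) {
            Range range = document.getRange(); // the whole document body
            for (int p = 0; p < range.numParagraphs(); p++) {
                Paragraph paragraph = range.getParagraph(p);
                for (int r = 0; r < paragraph.numCharacterRuns(); r++) {
                    CharacterRun run = paragraph.getCharacterRun(r);
                    lines.add(p + " bold=" + run.isBold() + " " + run.text().trim());
                }
            }
        }
        return lines;
    }
}
```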

Headers and Footers

To get at the headers and footers of a Word document, first create an org.apache.poi.hwpf.HWPFDocument. Next, you need to create an org.apache.poi.hwpf.usermodel.HeaderStories, passing it your HWPFDocument. Finally, the HeaderStories gives you access to the headers and footers, including first / even / odd page ones if defined in your document. Additionally, HeaderStories provides a method for removing any macros in the text, which is helpful as many headers and footers do end up with macros in them.
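The header and footer lookup might be sketched as below, using the POI class org.apache.poi.hwpf.usermodel.HeaderStories. setAreFieldsStripped(true) is used here as the switch that removes field (macro) codes from the returned text. HwpfHeaderFooterText and describe are illustrative names, and the page number passed to getHeader and getFooter is one-based.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.HeaderStories;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class HwpfHeaderFooterText {

    /** Returns the header and footer that apply to the given one-based page. */
    static String describe(InputStream in, int pageNumber) throws IOException {
        try (HWPFDocument document = new HWPFDocument(in)) {
            HeaderStories stories = new HeaderStories(document);
            stories.setAreFieldsStripped(true); // drop field codes from the text
            return "header: " + stories.getHeader(pageNumber)
                    + System.lineSeparator()
                    + "footer: " + stories.getFooter(pageNumber);
        }
    }
}
```

HeaderStories also offers getFirstHeader(), getOddHeader(), getEvenHeader() and the matching footer getters when a specific variant is wanted.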

 

Changing Text

It is possible to change the text via insertBefore() and insertAfter() on a Range object (either a Range, Paragraph or CharacterRun). It is also possible to delete a Range. This code will work in many, but not all cases, and patches to improve it are gratefully received!
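As a hedged sketch of such an edit, the following inserts text in front of the first paragraph and writes the result to an output stream. Given the caveat above that this code path does not work in all cases, treat it as a starting point rather than a robust recipe; HwpfInsertText and prepend are illustrative names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// Hypothetical example class; only the org.apache.poi types come from POI.
final class HwpfInsertText {

    /** Copies a .doc from in to out with the given text put before its first paragraph. */
    static void prepend(InputStream in, OutputStream out, String text) throws IOException {
        try (HWPFDocument document = new HWPFDocument(in)) {
            Range range = document.getRange();
            if (range.numParagraphs() > 0) {
                // Paragraph is itself a Range, so insertBefore() is available on it.
                Paragraph first = range.getParagraph(0);
                first.insertBefore(text);
            }
            document.write(out);
        }
    }
}
```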

 

Further Examples

For now, the best source of additional examples is in the unit tests. Browse the HWPF unit tests.
