C#: Read Content from a Word Document

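Reading content from Word documents is a common need in work and study. Reading a single page is useful for skimming and summarizing key information, reading a single section helps when you want to focus on one topic or part of the document, and reading the whole document gives the complete picture for overall analysis. This article shows how to use Spire.Doc for .NET in a C# project to read the content of one page, one section, or an entire Word document.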

Install Spire.Doc for .NET

First, add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be downloaded directly or installed via NuGet.

PM> Install-Package Spire.Doc

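C#: Read the Content of One Page from a Word Document

The FixedLayoutDocument and FixedLayoutPage classes give access to the content of a specific page. To make the extracted content easy to inspect, the sample code stores it in a new Word document. The steps are as follows: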

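Create a Document object.
Load the sample Word document with the Document.LoadFromFile() method.
Create a FixedLayoutDocument object.
Get the FixedLayoutPage object for one page of the document.
Get the section that contains the page through the FixedLayoutPage.Section property.
Get the index, within that section, of the first paragraph on the page.
Get the index, within that section, of the last paragraph on the page.
Create another Document object.
Add a new section with Document.AddSection().
Clone the properties of the original section to the new section with Section.CloneSectionPropertiesTo(newSection).
Copy the page's content from the original document to the new document.
Save the result document with the Document.SaveToFile() method.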

using Spire.Doc;
using Spire.Doc.Pages;
using Spire.Doc.Documents;
namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new document object
            Document document = new Document();
            // Load the document content from the specified file
            document.LoadFromFile("Sample.docx");
            // Create a fixed-layout document object
            FixedLayoutDocument layoutDoc = new FixedLayoutDocument(document);
            // Get the first page
            FixedLayoutPage page = layoutDoc.Pages[0];
            // Get the section that contains the page
            Section section = page.Section;
            // Get the paragraph of the first line in the page's first column
            Paragraph paragraphStart = page.Columns[0].Lines[0].Paragraph;
            int startIndex = 0;
            if (paragraphStart != null)
            {
                // Get the paragraph's index within the section body
                startIndex = section.Body.ChildObjects.IndexOf(paragraphStart);
            }
            // Get the paragraph of the last line in the page's first column
            Paragraph paragraphEnd = page.Columns[0].Lines[page.Columns[0].Lines.Count - 1].Paragraph;
            int endIndex = 0;
            if (paragraphEnd != null)
            {
                // Get the paragraph's index within the section body
                endIndex = section.Body.ChildObjects.IndexOf(paragraphEnd);
            }
            // Create a new document object
            Document newDoc = new Document();
            // Add a new section
            Section newSection = newDoc.AddSection();
            // Clone the original section's properties to the new section
            section.CloneSectionPropertiesTo(newSection);
            // Copy the page's objects, from startIndex through endIndex inclusive.
            // A paragraph that continues onto the next page is copied whole.
            for (int i = startIndex; i <= endIndex; i++)
            {
                newSection.Body.ChildObjects.Add(section.Body.ChildObjects[i].Clone());
            }
            // Save the new document to the specified file
            newDoc.SaveToFile("OnePageContent.docx", Spire.Doc.FileFormat.Docx);
            // Close and release the new document
            newDoc.Close();
            newDoc.Dispose();
            // Close and release the original document
            document.Close();
            document.Dispose();
        }
    }
}
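
Copying a page's paragraphs comes down to copying an inclusive index range, from the index of the first paragraph on the page through the index of the last one. The sketch below illustrates only that indexing logic, without Spire.Doc: a plain List<string> stands in for the section body's child objects, and CopyRange is a hypothetical helper introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RangeCopyDemo
{
    // Hypothetical helper: copies items[startIndex..endIndex], both ends inclusive.
    public static List<string> CopyRange(IReadOnlyList<string> items, int startIndex, int endIndex)
    {
        var result = new List<string>();
        for (int i = startIndex; i <= endIndex; i++)
        {
            result.Add(items[i]);
        }
        return result;
    }

    public static void Main()
    {
        // Stand-in for the child objects of a section body.
        var body = new List<string> { "p0", "p1", "p2", "p3", "p4" };

        // Suppose the page starts at p1 and ends at p3.
        List<string> page = CopyRange(body, 1, 3);

        Console.WriteLine(string.Join(", ", page)); // p1, p2, p3
    }
}
```

Starting the loop at 0 or stopping before endIndex would copy the wrong paragraphs for any page other than the first, which is why both bounds are taken from the computed indices.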


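C#: Read the Content of One Section from a Word Document

Document.Sections[index] returns a specific Section object, which contains the section's headers, footers, and body content. This example shows a simple way to copy the entire content of one section into another document. The steps are as follows: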

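Create a Document object.
Load the sample Word document with the Document.LoadFromFile() method.
Get the second section of the document with Document.Sections[1].
Create another new Document object.
Clone the original document's default style to the new document with Document.CloneDefaultStyleTo(newDoc).
Clone the second section of the original document into the new document with newDoc.Sections.Add(section.Clone()).
Save the result document with the Document.SaveToFile() method.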

using Spire.Doc;
namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new document object
            Document document = new Document();
            // Load the Word document from a file
            document.LoadFromFile("Sample.docx");
            // Get the second section of the document (index 1)
            Section section = document.Sections[1];
            // Create a new document object
            Document newDoc = new Document();
            // Clone the default style to the new document
            document.CloneDefaultStyleTo(newDoc);
            // Clone the second section into the new document
            newDoc.Sections.Add(section.Clone());
            // Save the new document to a file
            newDoc.SaveToFile("OneSectionContent.docx", Spire.Doc.FileFormat.Docx);
            // Close and release the new document object
            newDoc.Close();
            newDoc.Dispose();
            // Close and release the original document object
            document.Close();
            document.Dispose();
        }
    }
}


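C#: Read the Content of an Entire Word Document

This example reads the whole document by iterating over every section of the original document and cloning each one into a new document. The steps are as follows: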

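Create a Document object.
Load the sample Word document with the Document.LoadFromFile() method.
Create another new Document object.
Clone the original document's default style to the new document with Document.CloneDefaultStyleTo(newDoc).
Use a foreach loop to iterate over every section of the original document and clone it into the new document.
Save the result document with the Document.SaveToFile() method.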

using Spire.Doc;
namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a new document object
            Document document = new Document();
            // Load the Word document from a file
            document.LoadFromFile("Sample.docx");
            // Create a new document object
            Document newDoc = new Document();
            // Clone the default style to the new document
            document.CloneDefaultStyleTo(newDoc);
            // Iterate over every section of the original document and clone it into the new document
            foreach (Section sourceSection in document.Sections)
            {
                newDoc.Sections.Add(sourceSection.Clone());
            }
            // Save the new document to a file
            newDoc.SaveToFile("WholeDocumentContent.docx", Spire.Doc.FileFormat.Docx);
            // Close and release the new document object
            newDoc.Close();
            newDoc.Dispose();
            // Close and release the original document object
            document.Close();
            document.Dispose();
        }
    }
}
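
When only the plain text is needed rather than a formatted copy of the document, a shorter route may be enough. The following is a minimal sketch, assuming the Document.GetText() method of Spire.Doc; the input and output file names are hypothetical, and formatting, images, and layout are not carried over into the text file.

```csharp
using System.IO;
using Spire.Doc;

namespace SpireDocDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Load the source document (the file name is an assumption)
            Document document = new Document();
            document.LoadFromFile("Sample.docx");

            // Get the text of the whole document as a single string
            string text = document.GetText();

            // Write the text to a plain-text file (hypothetical output name)
            File.WriteAllText("WholeDocumentText.txt", text);

            // Close and release the document
            document.Close();
            document.Dispose();
        }
    }
}
```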


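Apply for a Temporary License

If you want to remove the evaluation message from the result documents, or lift the feature limitations, please request a temporary license by email. It is valid for 30 days.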
