python - Trying to convert an MS Word 2007 document to an XML format


I'm hoping I can leave out the history, but trust the following:

1. I have multiple people who have immediate access to MS Word 2007.
2. We are trying to produce a generic Word document that can go from person to person over several months while they add new content to it. Regardless of the answers given below, that part is not going to change. Whether it is a horrible idea, or whether there is a better one, I am already down this road :P
3. My 'ideal' setup (inside Word) was an XML schema, so that we could mark specific content areas (such as item number, item description, item stem, item options, item answer, etc.).
4. I taught myself XML Schema in less than 6 hours, and apparently I am a terrible teacher. I have an XML Schema file, I have imported it into Word, and I have flagged areas according to all the online tutorials. I was able to jump through hoops to save an "XML" file (from Word), and it looks like this:

        <note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>

(Just to shout out one random site: that example shows how I wanted the data from the Word document to be saved in the XML document.)

I was hoping that I could then parse this with Python, or send the XML file to a vendor who can upload the information into a database (and no, we cannot just upload it into the database ourselves; it has to go from Word document to XML to vendor).

Issue: Whenever I save the file as XML from MS Word 2007, it gives me all this horrible XML crap. I can see that I could parse it looking for my embedded XML tags, and I do find them, but they are so buried in Office tags and other nonsense that parsing it would be a big waste of time.

Finally: How can I automatically populate XML tags in Word with a schema I develop (or can I build a sample XML tree without a schema?) and export the content for upload/shipment? By 'automatically' I understand that someone still has to "select text" and "assign XML"; I am talking more about the 'saving out the XML' part.

Thanks for reading my short novel :P (Hopefully I was clear enough!)

      -J


If the data will be in the uniform format you have given (i.e. just note elements with fixed fields), the Word document could be reduced to holding one large table, with columns for from, to, heading, body, etc. Then you can parse it from Python using one of the existing methods for reading Word documents and output your custom XML. The .docx files are already XML, which may or may not make your job easier.
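As a rough illustration of the table idea, here is a minimal sketch that uses only the standard library. It assumes the .docx contains a single simple table whose first row holds the field names, and that those names are valid XML tag names; `table_rows`, `rows_to_notes_xml`, and the `<notes>` wrapper element are names I made up for this example, not anything Word or Python provides.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def table_rows(docx_file):
    """Yield each row of the first table as a list of cell strings."""
    with zipfile.ZipFile(docx_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    table = root.find(f".//{W}tbl")
    if table is None:
        return
    # Only direct child rows/cells, so a nested table would not be mixed in.
    for row in table.findall(f"{W}tr"):
        yield ["".join(t.text or "" for t in cell.iter(f"{W}t"))
               for cell in row.findall(f"{W}tc")]


def rows_to_notes_xml(rows):
    """Treat the first row as field names (assumption) and emit one <note> per row."""
    rows = list(rows)
    notes = ET.Element("notes")
    if rows:
        header, body = rows[0], rows[1:]
        for cells in body:
            note = ET.SubElement(notes, "note")
            for name, value in zip(header, cells):
                ET.SubElement(note, name).text = value
    return ET.tostring(notes, encoding="unicode")


# Hypothetical usage; "items.docx" is a placeholder file name:
# print(rows_to_notes_xml(table_rows("items.docx")))
```

Real documents can be messier than this (merged cells, content controls wrapped around cells), so treat it as a starting point rather than a complete reader.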

If the data is going to be more complex, one idea is to use Word styles to map text to the right tags. You can create a custom style for each tag, which is quick and easy for the user to click (and perhaps give each a different color and/or font). Then, while parsing the document, you can filter everything based on the applied paragraph style. I suspect this path will be painful, though.
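To give a feel for the style-based filtering, here is a small sketch, again with only the standard library. It assumes each tagged piece of text is its own paragraph and that you know the style IDs Word stores in the file (these can differ from the display names shown in the ribbon). The style IDs and tag names in the usage comment are hypothetical, as are the helper names.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraphs_with_style(docx_file):
    """Yield (style_id, text) for every paragraph; style_id is None if unstyled."""
    with zipfile.ZipFile(docx_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    for para in root.iter(f"{W}p"):
        style = para.find(f"{W}pPr/{W}pStyle")
        style_id = style.get(f"{W}val") if style is not None else None
        text = "".join(t.text or "" for t in para.iter(f"{W}t"))
        yield style_id, text


def tagged_fields(docx_file, style_to_tag):
    """Keep only paragraphs whose style ID appears in the style_to_tag mapping."""
    return [(style_to_tag[style_id], text)
            for style_id, text in paragraphs_with_style(docx_file)
            if style_id in style_to_tag]


# Hypothetical usage; the file name, style IDs and tag names are placeholders:
# tagged_fields("items.docx", {"ItemStem": "item_stem", "ItemAnswer": "item_answer"})
```

This only looks at paragraph styles; character styles applied to part of a paragraph would need the run properties inspected as well, which is part of why I expect this route to be painful.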

Another option is to write the document in a structured syntax that is easy to read and write by hand, and that you can parse after saving it as a plain text file, e.g. YAML:

    # plaintext_export.txt ------------------
    Notes:
    - From: Someone
      To: Someone else
      Heading: This is a heading
      Message: >
        Lorem ipsum dolor sit amet,
        consectetur adipisicing
    - From: Another guy
      To: Me
      Heading: Huh?
      Message: >
        Some other message content

would be as simple to parse as:

    >>> import yaml
    >>> from pprint import pprint
    >>> with open("plaintext_export.txt", 'r') as f:
    ...     data = yaml.safe_load(f)
    ...
    >>> pprint(data)
    {'Notes': [{'From': 'Someone',
                'Heading': 'This is a heading',
                'Message': 'Lorem ipsum dolor sit amet, consectetur adipisicing\n',
                'To': 'Someone else'},
               {'From': 'Another guy',
                'Heading': 'Huh?',
                'Message': 'Some other message content\n',
                'To': 'Me'}]}
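Once the notes are loaded into a plain Python structure like that, turning them into XML for the vendor could look roughly like this. It is a sketch under assumptions: the `<notes>`/`<note>` element names, the lower-cased tag names, and the `notes_to_xml` helper are mine, so adjust them to whatever schema the vendor expects.

```python
import xml.etree.ElementTree as ET


def notes_to_xml(data):
    """Build <notes><note>...</note></notes> from the parsed structure."""
    root = ET.Element("notes")
    for entry in data["Notes"]:
        note = ET.SubElement(root, "note")
        # Fixed field order so every <note> comes out the same way.
        for key in ("To", "From", "Heading", "Message"):
            if key in entry:
                # Lower-case the key for the tag name and strip the trailing
                # newline that a folded (">") scalar leaves behind.
                ET.SubElement(note, key.lower()).text = str(entry[key]).strip()
    return ET.tostring(root, encoding="unicode")
```

ElementTree escapes characters such as `<` and `&` in the text for you, which is one reason to build the tree this way instead of concatenating strings.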
